Gunnar Morling

Random Musings on All Things Software Engineering
Talking to Postgres Through Java 16 Unix-Domain Socket Channels

Posted at Jan 31, 2021

Update Feb 5: This post is discussed on Hacker News

Reading a blog post about what’s coming up in JDK 16 recently, I learned that one of the new features is support for Unix domain sockets (JEP 380). Before Java 16, you’d have to resort to 3rd party libraries like jnr-unixsocket in order to use them. If you haven’t heard about Unix domain sockets before, they are "data communications [endpoints] for exchanging data between processes executing on the same host operating system". Don’t be put off by the name, by the way; Unix domain sockets are also supported by macOS and even by Windows since version 10.

Databases such as Postgres or MySQL use them to offer an alternative to TCP/IP-based connections for client applications running on the same machine as the database. In such a scenario, Unix domain sockets are both more secure (no remote access to the database is exposed at all, and file system permissions can be used for access control) and more efficient than TCP/IP loopback connections.

A common use case is proxies for accessing cloud-based databases, such as the GCP Cloud SQL Proxy. Running on the same machine as a client application (e.g. in a sidecar container in case of Kubernetes deployments), they provide secure access to a managed database, for instance taking care of the SSL handling.

My curiosity was piqued and I was wondering what it’d take to make use of the new Java 16 Unix domain socket support for connecting to Postgres. It was your regular evening during the pandemic, without much to do, so I thought "Let’s give this a try". To have a test bed, I started with installing Postgres 13 on Fedora 33. Fedora might not always have the latest Postgres version packaged just yet, but following the official Postgres instructions it is straightforward to install newer versions.

In order to connect with user name and password via a Unix domain socket, one small adjustment to /var/lib/pgsql/13/data/pg_hba.conf is needed: the access method for the local connection type must be switched from the default value peer (which would try to authenticate using the operating system user name of the client process) to md5.

...
# TYPE  DATABASE        USER            ADDRESS                 METHOD
# "local" is for Unix domain socket connections only
local   all             all                                     md5
...

Make sure to apply the changed configuration by restarting the database (systemctl restart postgresql-13), and things are ready to go.
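To quickly verify that password authentication works over the local socket, you can for instance connect with psql, pointing it at the socket directory (the test_db database and test_user role used throughout this post are assumed to already exist):

psql -h /var/run/postgresql -U test_user test_db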

The Postgres JDBC Driver

The first thing I looked into was the Postgres JDBC driver. Since version 9.4-1208 (released in 2016), it allows you to configure custom socket factories, a feature which was added explicitly with Unix domain sockets in mind. The driver itself doesn’t come with a socket factory implementation that would actually support Unix domain sockets, but a few external open-source implementations exist. Most notably, junixsocket provides such a socket factory.

Custom socket factories must extend javax.net.SocketFactory, and their fully-qualified class name needs to be specified using the socketFactory driver parameter. So it should be easy to create a SocketFactory implementation based on the new UnixDomainSocketAddress class, right?

public class PostgresUnixDomainSocketFactory extends SocketFactory {

  @Override
  public Socket createSocket() throws IOException {
    var socket = new Socket();
    socket.connect(UnixDomainSocketAddress.of(
        "/var/run/postgresql/.s.PGSQL.5432")); (1)
    return socket;
  }

  // other create methods ...
}
1 Create a Unix domain socket address for the default path of the socket on Fedora and related systems
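The factory can then be referenced via the socketFactory connection parameter when obtaining a connection; the host part of the JDBC URL is essentially irrelevant in that case. A hypothetical wiring, reusing the test database and credentials shown further below, might look like this:

String url = "jdbc:postgresql://localhost/test_db"
    + "?socketFactory=dev.morling.demos.PostgresUnixDomainSocketFactory";

// obtaining a connection is enough to exercise the custom factory
try (Connection connection = DriverManager.getConnection(
    url, "test_user", "topsecret!")) {
  System.out.println("connected: " + !connection.isClosed());
}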

It compiles just fine; but it turns out not all socket addresses are equal, and java.net.Socket only connects to addresses of type InetSocketAddress (and the PG driver maintainers seem to sense some air of mystery around these "unusual" events, too):

org.postgresql.util.PSQLException: Something unusual has occurred to cause the driver to fail. Please report this exception.
  at org.postgresql.Driver.connect(Driver.java:285)
  ...

Caused by:
java.lang.IllegalArgumentException: Unsupported address type
  at java.base/java.net.Socket.connect(Socket.java:629)
  at java.base/java.net.Socket.connect(Socket.java:595)
  at dev.morling.demos.PostgresUnixDomainSocketFactory.createSocket(PostgresUnixDomainSocketFactory.java:19)
  ...

Now JEP 380 solely speaks about SocketChannel and stays silent about Socket; but perhaps obtaining a socket from a domain socket channel works?

public Socket createSocket() throws IOException {
  var sc = SocketChannel.open(UnixDomainSocketAddress.of(
      "/var/run/postgresql/.s.PGSQL.5432"));
  return sc.socket();
}

Nope, no luck either:

java.lang.UnsupportedOperationException: Not supported
  at java.base/sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:226)
  at dev.morling.demos.PostgresUnixDomainSocketFactory.createSocket(PostgresUnixDomainSocketFactory.java:17)

Indeed, it looks like JEP 380 concerns itself only with the NIO SocketChannel API, while users of the blocking Socket API don’t get to benefit from it. It should be possible to create a custom Socket implementation based on the socket channel support of JEP 380, but that’s going beyond the scope of my little exploration.
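The general direction would be a Socket sub-class which delegates to an underlying SocketChannel. A very rough, untested sketch, just to illustrate the idea (a complete implementation would have to override many more of Socket’s methods, e.g. for socket options and timeouts):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.net.SocketAddress;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.Channels;
import java.nio.channels.SocketChannel;

public class UnixDomainSocket extends Socket {

  private final SocketChannel channel;

  public UnixDomainSocket(String socketPath) throws IOException {
    // connect the channel right away; a socket factory would pass in the socket path
    this.channel = SocketChannel.open(UnixDomainSocketAddress.of(socketPath));
  }

  @Override
  public void connect(SocketAddress endpoint, int timeout) throws IOException {
    // already connected to the domain socket; ignore the TCP address passed in by the caller
  }

  @Override
  public InputStream getInputStream() throws IOException {
    return Channels.newInputStream(channel);
  }

  @Override
  public OutputStream getOutputStream() throws IOException {
    return Channels.newOutputStream(channel);
  }

  @Override
  public boolean isConnected() {
    return channel.isConnected();
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }
}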

The Vert.x Postgres Client

If the Postgres JDBC driver doesn’t easily benefit from the JEP, what about other Java Postgres clients then? There are several non-blocking options, including the Vert.x Postgres client and R2DBC. The former is also used to bring reactive Postgres capabilities into the Quarkus stack, so I turned my attention to it.

Now the Vert.x Postgres client already has support for Unix domain sockets, by means of adding the right Netty native transport dependency to your project. So purely from a functionality perspective, there’s not that much to be gained here. But being able to use domain sockets with the default NIO transport would still be nice, as it means one less dependency to take care of. So I dug a bit into the code of the Postgres client and Vert.x itself and figured out that two things needed adjustment:

  • The NIO-based Transport class of Vert.x needs to learn about the fact that SocketChannel now also supports Unix domain sockets (currently, an exception is raised when trying to use them without a Netty native transport)

  • Netty’s NioSocketChannel needs some small changes, as it tries to obtain a Socket from the underlying SocketChannel, which doesn’t work for domain sockets as we’ve seen above

Step 1 was quickly done by creating a custom sub-class of the default Transport class. Two methods needed changes: channelFactory() for obtaining a factory for the actual Netty transport channel, and convert() for converting a Vert.x SocketAddress into a NIO one:

public class UnixDomainTransport extends Transport {

  @Override
  public ChannelFactory<? extends Channel> channelFactory(
        boolean domainSocket) {

    if (!domainSocket) { (1)
      return super.channelFactory(domainSocket);
    }
    else {
      return () -> {
          try {
            var sc = SocketChannel.open(StandardProtocolFamily.UNIX); (2)
            return new UnixDomainSocketChannel(null, sc);
          }
          catch(Exception e) {
            throw new RuntimeException(e);
          }
        };
    }
  }

  @Override
  public SocketAddress convert(io.vertx.core.net.SocketAddress address) {
    if (!address.isDomainSocket()) { (3)
      return super.convert(address);
    }
    else {
      return UnixDomainSocketAddress.of(address.path()); (4)
    }
  }
}
1 Delegate creation of factories for non-domain-socket channels to the regular NIO transport implementation
2 This channel factory returns instances of our own UnixDomainSocketChannel type (see below), passing a socket channel based on the new UNIX protocol family
3 Delegate conversion of non-domain-socket addresses to the regular NIO transport implementation
4 Create a UnixDomainSocketAddress for the socket’s file system path

Now let’s take a look at the UnixDomainSocketChannel class. I was hoping to again get away with creating a sub-class of the NIO-based implementation, io.netty.channel.socket.nio.NioSocketChannel in this case. Unfortunately, though, the NioSocketChannel constructor invokes the taboo SocketChannel#socket() method. Of course that wouldn’t be a problem when doing this change in Netty itself, but for my little exploration I ended up copying the class and making a few small adjustments in that copy:

  • Avoiding the call to SocketChannel#socket() in the constructor:

    public UnixDomainSocketChannel(Channel parent, SocketChannel socket) {
        super(parent, socket);
        config = new NioSocketChannelConfig(this, new Socket()); (1)
    }
    1 Passing a dummy socket instead of socket.socket(); it shouldn’t be accessed in our case anyway
  • A few methods call the Socket methods isInputShutdown() and isOutputShutdown(); those can be bypassed by keeping track of the two shutdown flags ourselves (see the sketch after this list)

  • As I was creating the UnixDomainSocketChannel in my own namespace instead of Netty’s packages, a few references to the non-public method NioChannelOption#getOptions() needed commenting out, which again shouldn’t be relevant for the domain socket case
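For illustration, the flag-based handling of the shutdown state could look roughly like this in the copied channel class (a simplified sketch following the structure of NioSocketChannel, not the exact code from the commit linked below):

// keep track of the shutdown state ourselves instead of asking the (unavailable) Socket
private volatile boolean inputShutdown;
private volatile boolean outputShutdown;

@Override
public boolean isInputShutdown() {
  return inputShutdown || !isActive();
}

@Override
public boolean isOutputShutdown() {
  return outputShutdown || !isActive();
}

@Override
protected void doShutdownOutput() throws Exception {
  // operate on the SocketChannel directly rather than on SocketChannel#socket()
  javaChannel().shutdownOutput();
  outputShutdown = true;
}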

You can find the complete change in this commit. All in all, not exactly an artisanal piece of software engineering, but the little hack seemed good enough at least for taking a quick glimpse at the new domain socket support. Of course a real implementation could be done much more properly within the Netty project itself.

So it was time to give this thing a test ride. As we need to configure the custom Transport implementation, retrieval of a PgPool instance is a tad more verbose than usual:

PgConnectOptions connectOptions = new PgConnectOptions()
    .setPort(5432) (1)
    .setHost("/var/run/postgresql")
    .setDatabase("test_db")
    .setUser("test_user")
    .setPassword("topsecret!");

PoolOptions poolOptions = new PoolOptions()
    .setMaxSize(5);

VertxFactory fv = new VertxFactory();
fv.transport(new UnixDomainTransport()); (2)
Vertx v = fv.vertx();

PgPool client = PgPool.pool(v, connectOptions, poolOptions); (3)
1 The Vert.x Postgres client constructs the domain socket path from the given port and path (via setHost()); the full path will be /var/run/postgresql/.s.PGSQL.5432, just as above
2 Construct a Vertx instance with the custom transport class
3 Obtain a PgPool instance using the customized Vertx instance

We can then use the client instance as usual, only now it will connect to Postgres through the domain socket instead of via TCP/IP. All this solely using the default NIO-based transport, without the need for adding any Netty native dependency, such as its epoll-based transport.
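For instance, a trivial smoke test query issued via the usual Vert.x SQL client API will now go through the domain socket:

client
    .query("SELECT 1")
    .execute(ar -> {
      if (ar.succeeded()) {
        // one row expected, retrieved via the Unix domain socket
        System.out.println("Got " + ar.result().size() + " row(s)");
      }
      else {
        ar.cause().printStackTrace();
      }
    });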

I haven’t done any real performance benchmark at this point; in a quick ad-hoc test of executing a trivial SELECT query on a primary key 200,000 times, I observed a latency of ~0.11 ms when using Unix domain sockets (with both netty-transport-native-epoll and JDK 16 Unix domain sockets) and ~0.13 ms when connecting via TCP/IP. So definitely a significant improvement which can be a deciding factor for low-latency use cases, though in comparison to other reports, the latency reduction of ~15% appears to be at the lower end of the spectrum.

A more thorough performance evaluation should be done, for instance also examining the impact on garbage collection. And it goes without saying that you should only trust your own measurements, on your own hardware, based on your specific workloads, in order to decide whether you would benefit from domain sockets or not.

Other Use Cases

Database connectivity is just one use case for domain sockets; highly performant local inter-process communication comes in handy in all kinds of situations. One which I find particularly intriguing is the creation of modular applications based on a multi-process architecture.

When thinking of classic Java/Jakarta EE application servers for instance, you could envision a model where both the application server and each deployment are separate processes, communicating through domain sockets. This would have some interesting advantages, such as stricter isolation (for instance, an OutOfMemoryError in one deployed application won’t impact others) and re-deployments without any risk of classloader leaks, as the JVM of a deployment would be restarted. On the downside, you’d be facing a higher overall memory consumption (although that can at least partly be mitigated through class data sharing, which also works across JVM boundaries) and more costly (remote) method invocations between deployments.

Now the application server model has fallen out of favour for various reasons, but such a multi-process design is still very interesting, for instance for building modular applications that expose a single web endpoint while being assembled from a set of processes which are developed and deployed by several independent teams. Another use case would be desktop applications that are made up of a set of processes for isolation purposes, as most web browsers do nowadays, with distinct processes for separate tabs. JEP 380 should facilitate this model when creating Java applications, e.g. for rich clients built with JavaFX.

Another really interesting feature of Unix domain sockets is the ability to transfer open file descriptors from one process to another. This allows for non-disruptive upgrades of server applications, without dropping any open TCP connections. This technique is used for instance by Envoy Proxy for applying configuration changes: upon a configuration change, a second Envoy instance with the new configuration is started up, takes over the active sockets from the previous instance and, after some "draining period", triggers a shutdown of the old instance. This approach enables a truly immutable application design within Envoy itself, with all its advantages, without the need for in-process configuration reloads. I highly recommend reading the two posts linked above; they are super-interesting.

Unfortunately, JEP 380 doesn’t seem to support file descriptor transfers. So for this kind of architecture, you’d still have to resort to the aforementioned junixsocket library, which explicitly lists file descriptor transfer support as one of its features. While you couldn’t take advantage of that using Java’s NIO API, it should be doable using alternative networking frameworks such as Netty. Probably a topic for another blog post on another one of those pandemic weekends ;)

And that completes my small exploration of Java 16’s support for Unix domain sockets. If you want to do your own experiments with using them to connect to Postgres, make sure to install the latest JDK 16 EA build and grab the source code of my experiments from this GitHub repo.

It’d be my hope that frameworks like Netty and Vert.x make use of this JDK feature fairly quickly, as only a small amount of code changes is required, and users get to benefit from the higher performance of domain sockets without having to pull in any additional dependencies. In order to keep compatibility with Java versions prior to 16, multi-release JARs offer one avenue for integrating this feature.