This post is part of a new experiment: I'm toying with the idea of writing a concise and approachable "folk textbook" covering the conceptual basics of what I do: using Linux and other operating systems like macOS and Windows, home routers, networking software, servers, virtual machines, Linux containers, etc. I'm starting out by writing a few drafts or tidbits in this style as blog articles.

If you're new to running server applications or even if you're an experienced application developer who's never studied computer networking (TCP/IP specifically), there's a good chance you use listening addresses all the time without fully understanding how they work or what they're doing.

I definitely spent many years as a professional software developer before I started to learn all the details of how listening addresses work, and I wish I had found a resource like this article earlier.

To make matters worse, it seems like every application has slightly different behaviour when it comes to listening addresses and may parse or represent them differently. Too often these applications are not designed to be easy to use and they don't explain anything; they simply assume that you already know what you are doing.

Why Listening Addresses Matter

When starting to run a new server application such as a webserver or other network service, often we will start it up, try to connect to it, and... No dice. Why isn't it working? Why can't we connect? There are many possible reasons why, and this article aims to explain what could go wrong and how to figure out what's happening.

Our first step: Figure out whether or not the application is listening at all, and if so, what listening address(es) it's using.

Identifying a Listening Address

A listening address is typically written like a normal network address, that is, it's written as <host>:<port>.

<host> here should almost always be an IP address, and <port> should almost always be a single port number. However, in terms of how they are represented in configuration files and logs, there's no strict standard, and developers occasionally implement some, shall we say, "creative" representations.

Here are some examples of what I would call "standard" listening addresses that use IPv4 addresses:

  • 127.0.0.1:8080
  • 0.0.0.0:22
  • 123.45.67.8:35871

And here are some that use IPv6 addresses:

  • [::1]:8080
  • [::]:22
  • [2607:fb90:1788:efbd:d3f9:555:4ddb:c8ed]:35871

And here are some that use various shorthand notations or alternate representations:

  • localhost:8080
  • ::1:8080
  • :22
  • *:22
  • :::22
  • example.com:35871
  • 127.0.0.1:3000-4000
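
To get a feel for how a program might pull these strings apart, here's a minimal sketch in Go using the standard library's net.SplitHostPort. The helper name splitListenAddr is mine, purely for illustration, and other languages and applications may well parse these strings differently.

```go
package main

import (
	"fmt"
	"net"
)

// splitListenAddr separates a "<host>:<port>" string into its two parts
// using the standard library's parser. (Hypothetical helper name.)
func splitListenAddr(addr string) (host, port string, err error) {
	return net.SplitHostPort(addr)
}

func main() {
	examples := []string{
		"127.0.0.1:8080",
		"[::1]:8080",
		"[::]:22",
		":22",      // empty host
		"*:22",     // "*" is passed through as the host, not interpreted
		"::1:8080", // ambiguous: rejected for having too many colons
	}
	for _, addr := range examples {
		host, port, err := splitListenAddr(addr)
		if err != nil {
			fmt.Printf("%-16s -> error: %v\n", addr, err)
			continue
		}
		fmt.Printf("%-16s -> host=%q port=%q\n", addr, host, port)
	}
}
```

The last example is rejected because, without the square brackets, there's no telling which colon separates the port. Also note that SplitHostPort doesn't validate the port, so something like 3000-4000 would pass straight through as the port string.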

Any given server application should write the address it decided to listen on to its log immediately after it starts up. We should be able to connect to the application by connecting to that address. But how would one connect to 0.0.0.0:22 or :22? What do those addresses even mean?

About "Special" IPv4 and IPv6 Addresses

You may already be aware of some "special" IP addresses, for example

127.0.0.1 which is called the loopback address, the address of the localhost domain.

OR

192.168.0.1 which might be the address of your home router.

If you want to learn more about these, I recommend the Wikipedia article on IPv4.

There's another special IPv4 address which (as far as I know) is only used by listening addresses:

0.0.0.0 technically means something like "null" or "no address", but when used as a listening address, it's interpreted as "Listen on ALL Addresses".

Don't ask me why, but IPv6 addresses are written with colons : in between the numbers instead of periods . (in my opinion this was a huge mistake 😫)

So this special address for IPv6 would be written as 0:0:0:0:0:0:0:0, but you never actually see it written that way. IPv6 also introduced a special shorthand where a run of consecutive zero groups (0:0:...) is abbreviated as ::. So, confusingly, the IPv6 "Listen on ALL Addresses" IP address is written as ::.

To make matters worse, when an IPv6 address is represented as a part of a network address, we have to be able to tell the difference between the IPv6 colons and the <host>:<port> colon. So those network addresses are written like [::]:22 or [::1]:8080.

In case you were wondering, ::1 (expanded, it would be 0:0:0:0:0:0:0:1) is the IPv6 "loopback address" like IPv4's 127.0.0.1, the address that loops back to the same computer that is dialing.
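
If you'd like to poke at these special addresses yourself, here's a rough sketch using Go's net/netip package (Go 1.18 or newer). The describe function and its labels are my own invention for this example; the IsUnspecified and IsLoopback checks come from the standard library.

```go
package main

import (
	"fmt"
	"net/netip"
)

// describe labels an IP address as "all addresses", "loopback", or "ordinary".
// (Hypothetical helper; the labels are just for this example.)
func describe(s string) string {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "invalid"
	}
	switch {
	case addr.IsUnspecified(): // 0.0.0.0 or ::
		return "all addresses"
	case addr.IsLoopback(): // anything in 127.0.0.0/8, or ::1
		return "loopback"
	default:
		return "ordinary"
	}
}

func main() {
	for _, s := range []string{"0.0.0.0", "::", "127.0.0.1", "::1", "192.168.0.1"} {
		addr := netip.MustParseAddr(s)
		// StringExpanded writes IPv6 addresses out in full, without the "::" shorthand.
		fmt.Printf("%-12s %-40s %s\n", s, addr.StringExpanded(), describe(s))
		// Printing an AddrPort adds the square brackets around IPv6 addresses.
		fmt.Println("  as a listening address:", netip.AddrPortFrom(addr, 8080))
	}
}
```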

Rant: A Server Application is Giving Us The Silent Treatment...

Here's a concrete example of just how terrible the user interface and usability of server software tends to be. The following is based on a true story...


The venerable web server nginx (pronounced "engine-x" 🤦) doesn't log anything at all when we run it:

$ sudo /usr/sbin/nginx
$ 

In fact, it exits immediately (gives me my command prompt back) as if it refused to run or perhaps crashed. So what gives? Let's check which process exit code it returned. (For process exit codes, 0 (zero) means success, and any non-zero number means failure.)

$ sudo /usr/sbin/nginx
$ echo $?
0

It returned 0, so it says that it worked. We get a clue if we try running it again:

$ sudo /usr/sbin/nginx
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to [::]:80 failed (98: Address already in use)

This time it's complaining that it can't listen on port 80 because some other process is already listening on port 80... But it's not referring to it as a "port", it's referring to it as an "address". We've spotted our first listening addresses, 0.0.0.0:80 and [::]:80, in the wild! In this case, it looks like nginx is trying and failing to listen on port 80 on all IPv4 and IPv6 addresses.

It turns out that by default, nginx runs in "daemon" mode. And by default, it doesn't log anything while it does this. That means when I ran sudo /usr/sbin/nginx and it appeared to have cancelled or failed, what actually happened was the nginx server spawned in a separate process. I never saw any output because my shell was not attached to that other process. That's also why it complained about Address already in use the second time I ran it: it was trying to start up a second copy of the server while the first one was still running.
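
The same "Address already in use" failure is easy to reproduce on purpose. Here's a small sketch in Go that listens on a loopback port and then tries to listen on the exact same address again. I'm assuming Linux or macOS here (that's where I'd expect the syscall.EADDRINUSE comparison to match), and the function name is made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
)

// secondListenFails binds a loopback port, then tries to bind the same
// address again, and reports whether the second attempt failed with
// "address already in use". (Hypothetical helper name.)
func secondListenFails() (bool, error) {
	first, err := net.Listen("tcp", "127.0.0.1:0") // port 0: let the OS pick a free port
	if err != nil {
		return false, err
	}
	defer first.Close()

	second, err := net.Listen("tcp", first.Addr().String())
	if err == nil {
		second.Close()
		return false, nil
	}
	fmt.Println("second listen:", err)
	return errors.Is(err, syscall.EADDRINUSE), nil
}

func main() {
	inUse, err := secondListenFails()
	if err != nil {
		fmt.Println("could not open the first listener:", err)
		return
	}
	fmt.Println("address already in use:", inUse)
}
```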

After a bit of looking things up online, I returned with the following command line options for nginx:

$ sudo /usr/sbin/nginx -g 'daemon off; error_log stderr info;'
2022/01/30 21:54:21 [notice] 431278#431278: using the "epoll" event method
2022/01/30 21:54:21 [notice] 431278#431278: nginx/1.18.0 (Ubuntu)
2022/01/30 21:54:21 [notice] 431278#431278: OS: Linux 5.4.0-96-generic
2022/01/30 21:54:21 [notice] 431278#431278: getrlimit(RLIMIT_NOFILE): 1024:1048576
2022/01/30 21:54:21 [notice] 431278#431278: start worker processes
2022/01/30 21:54:21 [notice] 431278#431278: start worker process 431279
2022/01/30 21:54:21 [notice] 431278#431278: start worker process 431280
2022/01/30 21:54:21 [notice] 431278#431278: start worker process 431281
...

The -g flag is for "global directives"; it allows me to add a couple of extra configuration directives on top of the existing configuration file.

The daemon off; part tells it to run like a normal process instead of in "daemon mode" (that is, stay attached to my shell so we can see what it's doing).

Finally, error_log stderr info; tells it to output its error log to the process' stderr stream, which will be displayed in the shell.

However, it still doesn't log the port(s) it's listening on 😒

IMO, this should be embarrassing for the developers/maintainers of nginx: someone trying to use their software couldn't even tell if it was running or not. Their software did not report anything at all when it was executed. Even after traversing the first usability gap, the user still doesn't know whether or not it's listening on some port, and if so, which port it's listening on.

But they probably aren't embarrassed at all; they probably view this whole kerfuffle as "the user's fault" for being ignorant; for not already knowing how nginx works or how to answer their questions on their own. In my opinion, that's a toxic take. The whole point is that we are trying to learn about nginx and server software in general. It is nginx that is at fault here for making this such a painful experience; it doesn't have to be this way IMO. There is plenty of other server oriented software which manages to keep its command line interface approachable.

It turns out that nginx DOES in fact log the ports / addresses it's listening on. However, it logs them at the debug log level, so they stay hidden unless we ask for the debug log level instead of the info log level, that is, error_log stderr debug instead of error_log stderr info:

forest@thingpad:~$ sudo /usr/sbin/nginx -g 'daemon off; error_log stderr debug;'
2022/07/26 12:08:21 [debug] 19486#19486: bind() 0.0.0.0:80 #6 
2022/07/26 12:08:21 [debug] 19486#19486: bind() [::]:80 #7 
2022/07/26 12:08:21 [notice] 19486#19486: using the "epoll" event method
2022/07/26 12:08:21 [debug] 19486#19486: counter: 00007F4CAA115080, 1
2022/07/26 12:08:21 [notice] 19486#19486: nginx/1.18.0 (Ubuntu)
2022/07/26 12:08:21 [notice] 19486#19486: OS: Linux 5.4.0-122-generic
2022/07/26 12:08:21 [notice] 19486#19486: getrlimit(RLIMIT_NOFILE): 1024:1048576
2022/07/26 12:08:21 [debug] 19486#19486: write: 8, 00007FFFEC8AE470, 6, 0
2022/07/26 12:08:21 [debug] 19486#19486: setproctitle: "nginx: master process /usr/sbin/nginx -g daemon off; error_log stderr debug;"
2022/07/26 12:08:21 [notice] 19486#19486: start worker processes

There are those two listening addresses, 0.0.0.0:80 and [::]:80, again! Last time we saw them in an error message, but this time we see them in a debug log. Finally, nginx is running properly and it's also at least sort of letting us know what it's doing.

In my opinion, an application with a proper user interface wouldn't run in daemon mode by default, and it would log something like this when it starts up.

nginx is starting up!
...
I am now listening publicly on 0.0.0.0:80 and [::]:80
You can connect to me at http://localhost/

Or if it was going to run in daemon mode by default, it would log something like this when run:

I am now spawning the nginx daemon process. 
...
Startup succeeded, nginx daemon is now running!
If you would like to see the output of that process, please check the log file /var/log/nginx.log


Flipping the Table (╯°□°)╯︵ ┻━┻

Luckily, we don't have to depend on applications like nginx which may be "unreliable narrators" at times.

We can always just ask the operating system what network addresses are listening and which application process those "listening sockets" are associated with. Use sudo ss -tulpn or sudo netstat -tulpn on Linux, sudo lsof -nP -iTCP -sTCP:LISTEN on macOS (it has no ss, and its netstat doesn't take the same flags), and netstat -aon on Windows.

ℹ️ NOTE: The -tulpn flags stand for -tcp, -udp, -listening, include the -program name, and display addresses and ports as -numbers instead of resolving them to host and service names. sudo is included because, without root, the -program flag can only show names for processes owned by your own user.

On my computer, the netstat and ss commands outputted too much text to copy and paste here, so I've edited the output down a bit:

Netid  State    Recv-Q  Send-Q  Local Address:Port   Peer Address:Port  Process
...
tcp    LISTEN   0       4096        127.0.0.1:35625       0.0.0.0:*     containerd
tcp    LISTEN   0       50            0.0.0.0:139         0.0.0.0:*     smbd
tcp    LISTEN   0       4096          0.0.0.0:111         0.0.0.0:*     rpcbind
tcp    LISTEN   0       511           0.0.0.0:80          0.0.0.0:*     nginx
tcp    LISTEN   0       32      192.168.122.1:53          0.0.0.0:*     dnsmasq
...
tcp    LISTEN   0       250              [::]:3142           [::]:*     apt-cacher-ng
tcp    LISTEN   0       50               [::]:139            [::]:*     smbd
tcp    LISTEN   0       4096             [::]:111            [::]:*     rpcbind
tcp    LISTEN   0       511              [::]:80             [::]:*     nginx
tcp    LISTEN   0       5               [::1]:631            [::]:*     cupsd
...  

Here we can obtain the same information: There are two listening sockets owned by the nginx process, and they have the Local Address:Ports 0.0.0.0:80 and [::]:80. In other words, nginx is listening on port 80 on all IPv4 and IPv6 addresses.

How Listening Addresses Work

The 0.0.0.0:80 and [::]:80 example above demonstrates the simplest case for a listening address.

With those two listening sockets, if anyone dials that computer from anywhere, literally anywhere, they will be able to connect on that port (:80). Of course, there are caveats. The listening computer may not be routable from everywhere. Or a firewall might be blocking the traffic. The firewall could be anywhere; on the dialing computer (client), the listening computer (server) or some equipment in-between.

At any rate, when they connect, the listening computer's operating system will notify the process attached to the listening socket, thus triggering the connection handler code to run inside the server application process that created the listening socket.
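
To make that hand-off a little more concrete, here's a bare-bones sketch of a listening server in Go. Each call to Accept returns once the operating system has a newly connected client for us, and handle plays the role of the "connection handler". All of the function names here are just ones I picked for the example, and it listens on a loopback address so it's safe to run anywhere.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
)

// serve accepts connections forever; each Accept call returns when the
// operating system hands over a newly connected client.
func serve(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // the listener was closed
		}
		go handle(conn) // run the connection handler
	}
}

// handle is the connection handler: it greets the client and hangs up.
func handle(conn net.Conn) {
	defer conn.Close()
	fmt.Fprintf(conn, "hello, %s\n", conn.RemoteAddr())
}

// startServer listens on a free loopback port and serves in the background.
func startServer() (addr string, stop func(), err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	go serve(ln)
	return ln.Addr().String(), func() { ln.Close() }, nil
}

// greet dials addr and returns the first line the server sends back.
func greet(addr string) (string, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	return bufio.NewReader(conn).ReadString('\n')
}

func main() {
	addr, stop, err := startServer()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer stop()
	fmt.Println("listening on", addr)

	line, err := greet(addr)
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	fmt.Print("server said: ", line)
}
```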

A listening address like 127.0.0.1:80 or [::1]:80 is more complicated. It's a real address, so when a process asks the operating system to listen on this address, the OS doesn't mess around: it listens on exactly that port on exactly that address and nothing else. This can quickly become a problem; in fact, I would say this is probably the most common source of pain that humans encounter when it comes to listening addresses.

The address can only be dialed via the network that it is a part of. (See the List of IPv4 Networks that are part of the IPv4 specification).

127.0.0.1 is in the 127.0.0.0/8 loopback network range, so in order to be able to dial it, you have to be on the loopback network. And by definition, the loopback network only has one computer on it: the computer that dialed! This network is used exclusively for a computer to dial itself.

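💀 If you are morbidly curious what the heck the /8 in 127.0.0.0/8 is: it's CIDR notation. The number after the slash says how many leading bits of the address identify the network, so 127.0.0.0/8 covers every address from 127.0.0.0 to 127.255.255.255.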

So if your application is listening on 127.0.0.1:80 and [::1]:80, then it will only accept connections coming from the same computer.
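
Whether an address belongs to a range like 127.0.0.0/8 can be checked mechanically. Here's a small sketch with Go's net/netip package; inNetwork is a name I made up, and it simply returns false for anything it can't parse.

```go
package main

import (
	"fmt"
	"net/netip"
)

// inNetwork reports whether ip falls inside the CIDR range cidr.
// (Hypothetical helper; unparseable input counts as "not in the network".)
func inNetwork(cidr, ip string) bool {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return false
	}
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	return prefix.Contains(addr)
}

func main() {
	fmt.Println(inNetwork("127.0.0.0/8", "127.0.0.1"))       // true
	fmt.Println(inNetwork("127.0.0.0/8", "127.255.0.9"))     // true: still loopback
	fmt.Println(inNetwork("127.0.0.0/8", "192.168.0.24"))    // false
	fmt.Println(inNetwork("192.168.0.0/24", "192.168.0.24")) // true
}
```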

The same effect can be achieved for any other network by listening on whatever address your computer has on that network. For example, my home LAN's network is 192.168.0.0/24 and my server's address on that network is 192.168.0.24:

root@odroidxu4:~# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
4: enx001e0636dda6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:1e:06:36:dd:a6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.24/24 brd 192.168.0.255 scope global dynamic noprefixroute enx001e0636dda6
       valid_lft 3117sec preferred_lft 3117sec
    inet6 fe80::21a3:f80e:be6a:e049/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

So if I told an application running on that server to listen on 192.168.0.24:80, it would only be dial-able by computers on my home LAN.
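
An application can discover the same addresses that ip addr show prints. As a rough sketch, Go's net.InterfaceAddrs returns every address assigned to the machine. The localAddrs wrapper and the port 80 below are only there for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// localAddrs returns every address assigned to this computer's network
// interfaces, as strings such as "192.168.0.24/24". (Hypothetical helper.)
func localAddrs() ([]string, error) {
	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(addrs))
	for _, a := range addrs {
		out = append(out, a.String())
	}
	return out, nil
}

func main() {
	addrs, err := localAddrs()
	if err != nil {
		fmt.Println("could not list addresses:", err)
		return
	}
	for _, cidr := range addrs {
		ip, _, err := net.ParseCIDR(cidr)
		if err != nil {
			continue // not an IP address in CIDR form
		}
		// Link-local fe80:: addresses would also need a zone (like %eth0)
		// before they could actually be used as a listening address.
		fmt.Printf("%-28s could listen on %s\n", cidr, net.JoinHostPort(ip.String(), "80"))
	}
}
```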

Typically this behaviour is only used in one of two ways: listen on 0.0.0.0 and :: (all addresses), or listen on 127.0.0.1 and ::1 (local connections only).

But it's important to choose the right one. You may wish to listen for local connections only for security reasons, or you may wish to listen on all addresses because you have something you want to share with the world.

So if your server software isn't responding, you can use ss or netstat to figure out if it's listening at all and if so, what listening addresses it's using. It's possible that it may be listening on the wrong address:

  • Listening only on the loopback address while someone from outside is trying to connect
    • This is a common problem with Docker containers. Processes running inside Docker containers shouldn't listen on the loopback address; if they do, they will only be accessible from inside that container.
  • Listening only on IPv6 while someone is trying to connect via IPv4
  • Listening only on IPv4 while someone is trying to connect via IPv6
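
One quick way to tell these cases apart is to dial the port once over IPv4 and once over IPv6. Here's a sketch of that idea in Go, where the "tcp4" and "tcp6" network names force one protocol or the other. The helper names and port 8080 are placeholders I chose, and listenIPv4Loopback only exists to stand in for a server that's listening on the "wrong" address.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// canDial reports whether a TCP connection to host:port succeeds.
// network is "tcp4" to force IPv4 or "tcp6" to force IPv6.
func canDial(network, host, port string) bool {
	conn, err := net.DialTimeout(network, net.JoinHostPort(host, port), 2*time.Second)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// listenIPv4Loopback opens an IPv4-only loopback listener on a free port,
// standing in for a server that listens on only one kind of address.
func listenIPv4Loopback() (port string, stop func(), err error) {
	ln, err := net.Listen("tcp4", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	_, port, err = net.SplitHostPort(ln.Addr().String())
	if err != nil {
		ln.Close()
		return "", nil, err
	}
	return port, func() { ln.Close() }, nil
}

func main() {
	port := "8080" // placeholder: whatever port your server should be using
	fmt.Println("IPv4 loopback:", canDial("tcp4", "127.0.0.1", port))
	fmt.Println("IPv6 loopback:", canDial("tcp6", "::1", port))
	// To test from another machine, swap in the server's LAN address instead.
}
```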

Dual-Stack Sockets

I just said that sometimes the application might be

Listening only on IPv6 while someone is trying to connect via IPv4

It's worth noting that this is supposed to never happen.

Linux specifically uses something called "dual-stack sockets" by default, controlled by the net.ipv6.bindv6only sysctl. This means that if you create a listener for IPv6 connections on all addresses, it should "automagically" listen for IPv4 connections on all addresses as well. The mechanism behind this (the IPV6_V6ONLY socket option from RFC 3493) is standardized across operating systems, but the default is not: Windows, for example, creates IPv6 sockets as IPv6-only unless the application asks otherwise.

In my previous nginx example, nginx is actually going out of its way to avoid using a dual-stack socket; in its default configuration

# Default server configuration
#
server {
	listen 80 default_server;
	listen [::]:80 default_server;

it very clearly specifies to the OS that it wants to create one socket for IPv4 and one socket for IPv6.
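
For comparison, here's roughly what that "one socket per protocol" approach could look like in a program of my own. This is only a sketch of the idea in Go, not how nginx itself is implemented: the "tcp4" and "tcp6" network names ask for an IPv4-only and an IPv6-only listener respectively, and listenSplit and portOf are names I made up.

```go
package main

import (
	"fmt"
	"net"
)

// portOf returns the port number a listener is bound to.
func portOf(ln net.Listener) string {
	_, port, _ := net.SplitHostPort(ln.Addr().String())
	return port
}

// listenSplit opens one IPv4-only and one IPv6-only listener on the same
// port, on all addresses. The IPv6 listener is nil if IPv6 is unavailable.
func listenSplit() (v4, v6 net.Listener, err error) {
	v4, err = net.Listen("tcp4", "0.0.0.0:0") // port 0: let the OS pick a free port
	if err != nil {
		return nil, nil, err
	}
	// With the "tcp6" network, Go marks the socket as IPv6-only, so it does
	// not collide with the IPv4 listener already bound to the same port.
	v6, err = net.Listen("tcp6", net.JoinHostPort("::", portOf(v4)))
	if err != nil {
		return v4, nil, nil // for example, no IPv6 on this machine
	}
	return v4, v6, nil
}

func main() {
	v4, v6, err := listenSplit()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer v4.Close()
	fmt.Println("IPv4 socket:", v4.Addr())
	if v6 != nil {
		defer v6.Close()
		fmt.Println("IPv6 socket:", v6.Addr())
	}
}
```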

However, if I write my own program in Go or Node.js to create a listening server:

📄 main.go

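package main

import (
	"log"
	"net/http"
)

func main() {
	// ":8080" has an empty host, so Go listens on all addresses.
	// ListenAndServe only returns if something goes wrong, so log the error.
	log.Fatal(http.ListenAndServe(":8080", nil))
}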

📄 index.js

"use strict";

const express = require("express");

const app = express();

app.listen(8080);

Then I'll only see one listening socket, even though I should be able to dial the app via both IPv4 and IPv6:

forest@thingpad:~$ sudo ss -tlpn 
State     Recv-Q  Send-Q  Local Address:Port  Peer Address:Port  Process  
...
LISTEN    0       511     *:8080              *:*                node
...

forest@thingpad:~$ curl 192.168.0.46:8080
<!DOCTYPE html>
<html lang="en">...blahblahhtmlblah...

forest@thingpad:~$ curl '[::1]:8080'
<!DOCTYPE html>
<html lang="en">...blahblahhtmlblah..

It's worth noting that netstat on Linux won't provide any indication that this is happening; it will display the dual-stack listening socket as if it were a normal IPv6 listening socket. However, the newer program ss will differentiate between the two. Also, ss uses the unambiguous IPv6 address format [::]:8080 over the messier :::8080. So ss is definitely preferred.

Compare the netstat output:

forest@thingpad:~$ sudo netstat -tlpn 
Active Internet connections (only servers)
Proto Recv-Q Send-Q  Local Address  Foreign Address  State    PID/Program name    
...            
tcp        0      0  0.0.0.0:80     0.0.0.0:*        LISTEN   1567/nginx               
tcp6       0      0  :::80          :::*             LISTEN   1567/nginx      
...
tcp6       0      0  :::8080        :::*             LISTEN   35378/node    
...

Versus the ss output:

forest@thingpad:~$ sudo ss -tlpn 
State     Recv-Q  Send-Q  Local Address:Port  Peer Address:Port  Process                                                                                                                                               
...
LISTEN    0       511     0.0.0.0:80          0.0.0.0:*          nginx       
LISTEN    0       511     [::]:80             [::]:*             nginx
...
LISTEN    0       511     *:8080              *:*                node
...

It's rare, but sometimes something will go wrong with a dual-stack socket and we'll wish that we could split it up the way nginx does. I've usually encountered this when using uncommon software or in a funky operating system environment. Some examples:

  • I was unable to get gpsd to listen on both IPv4 and IPv6 at the same time on my Odroid single board computer.
    • To solve this, I ended up configuring it to listen on IPv4 only, and I wrote a simple UDP proxy server to listen on IPv6 and forward packets to it.
  • Once upon a time I encountered an issue where Caddy Server appeared to have dual-stack-related listening issues on an Alpine Linux capsul.
  • A fellow cyberian reported issues with dual-stack listening sockets on an Ubuntu WSL virtual machine. The issue only occurred when dialing host.docker.internal from inside a Docker container.
    • To solve this, the cyberian switched the server application from listening on [::]:1337 to listening on 0.0.0.0:1337.

Theory, Meet Practicality

When running a server application that opens a listening port, you might not even be able to specify which address it listens on. Or even if you could specify the address, you may not be able to specify multiple addresses or specify the details of how the application asks the operating system to listen. (For example, what value it assigns to a socket option flag like IPV6_V6ONLY)
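
For the curious, here's a sketch of what setting that flag by hand can look like when you do control the code. It uses Go's net.ListenConfig with a Control callback, which runs after the socket is created but before it's bound. I'm assuming Linux or macOS (the syscall calls differ on Windows), and listenV6 and acceptsIPv4 are hypothetical helpers of mine.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"syscall"
	"time"
)

// listenV6 opens an IPv6 listener on addr and explicitly sets the
// IPV6_V6ONLY socket option before the socket is bound.
func listenV6(addr string, v6only bool) (net.Listener, error) {
	value := 0
	if v6only {
		value = 1
	}
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var optErr error
			err := c.Control(func(fd uintptr) {
				optErr = syscall.SetsockoptInt(int(fd), syscall.IPPROTO_IPV6, syscall.IPV6_V6ONLY, value)
			})
			if err != nil {
				return err
			}
			return optErr
		},
	}
	return lc.Listen(context.Background(), "tcp6", addr)
}

// acceptsIPv4 reports whether an IPv4 client on this machine can reach ln.
func acceptsIPv4(ln net.Listener) bool {
	_, port, err := net.SplitHostPort(ln.Addr().String())
	if err != nil {
		return false
	}
	conn, err := net.DialTimeout("tcp4", net.JoinHostPort("127.0.0.1", port), 2*time.Second)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	for _, v6only := range []bool{false, true} {
		ln, err := listenV6("[::]:0", v6only)
		if err != nil {
			fmt.Println("listen failed (no IPv6 here?):", err)
			return
		}
		fmt.Printf("IPV6_V6ONLY=%v on %s: reachable over IPv4: %v\n", v6only, ln.Addr(), acceptsIPv4(ln))
		ln.Close()
	}
}
```

On a machine with working IPv6, I'd expect the first listener to be reachable over IPv4 and the second one not to be, but as the rest of this section says, environments can differ.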

In fact, operating systems or environments can be slightly different in how they handle these listen syscalls. Programming language standard libraries / runtimes differ in how they make the syscalls to request listening sockets as well. So even as an application developer, you may not always have the luxury of specifying exactly how your application listens for connections.

However, despite any limitations that may exist, understanding some of these details should help your server software efforts bear fruit and may help you avoid frustration / disappointment. By understanding the inner workings of any limitation you may run up against, you open up the possibility of side-stepping it with a quick hack or even directly addressing/eliminating it. Happy hosting!
