Let us start with a bit of background, shall we? Rsync is a widely used file copying tool that can operate either locally or between remote systems, over a remote shell or directly over TCP. It is mostly used for mirroring and backups.
At CybelAngel we greatly improved the way we detect Rsync breaches. We are now able to extensively scan open servers to find all the sensitive information they may contain. Very soon after the deployment of some recent improvements, the number of Rsync data leak alerts we delivered to our customers skyrocketed. Our dear analysts still grieve over the hard time they had processing all of them at first. But why did we get so many? How come so many Rsync servers are exposed?
One might think that Rsync servers are not secure and cannot be trusted to keep sensitive data for backup or any other purpose. It is not as simple as that. When you want to keep something safe, such as your personal belongings you brought to a swimming pool, you might put them in a locker, but you would never leave that locker unlocked, right? If you did, and your belongings were stolen, would you say that the locker was not safe?
This reasoning applies to Rsync and to any other protocol. If you want your data to be safe when you store it through Rsync, you have to lock it properly, with the proper configuration. By default, an Rsync server is actually open: no authentication is required, and there is no restriction on which IP addresses can connect to the server or on how many of them can connect simultaneously. This is why we detect so many exposed Rsync servers. By running the default configuration, Rsync users allow any machine to connect and see what is going on.
Below are a few examples of precautions one can take to ensure a minimum of security using Rsync, but there are many other important Rsync security configurations to consider:
Enable chroot: this ensures that only the chroot directory will be visible to someone connecting to the server, and not the entire file system.
Use hosts allow/deny: this allows you to specify what IP addresses can (allow) or cannot (deny) connect to the Rsync server, so only necessary hosts have access to it.
Set authentication: this allows you to specify usernames or groups along with passwords per user or group.
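To make these three precautions concrete, here is a minimal Python sketch that reads rsyncd.conf-style text and flags modules lacking them. The parameter names (`use chroot`, `hosts allow`, `hosts deny`, `auth users`, `secrets file`) are real rsyncd.conf parameters, but the module names, paths, user name and address range are placeholders we made up, and the parser is a simplification (it ignores line continuations, for instance). Treat it as an illustration rather than a complete audit tool.

```python
# Hypothetical configuration: module names, paths, user and addresses are placeholders.
SAMPLE_CONF = """\
# global defaults, inherited by every module
use chroot = yes
max connections = 4

[backup]
    path = /srv/backup
    read only = yes
    hosts allow = 192.0.2.0/24
    hosts deny = *
    auth users = backupuser
    secrets file = /etc/rsyncd.secrets

[public]
    path = /srv/public
"""

# Parameters that must be present (and non-empty) for each module.
REQUIRED = ("hosts allow", "auth users", "secrets file")
TRUE_VALUES = {"yes", "true", "1"}


def parse_rsyncd_conf(text):
    """Return (global_params, {module_name: params}) from rsyncd.conf-style text."""
    global_params, modules = {}, {}
    current = global_params
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = modules.setdefault(line[1:-1].strip(), {})
            continue
        if "=" in line:
            key, _, value = line.partition("=")
            # Simplification: lowercase the key and collapse inner whitespace.
            current[" ".join(key.lower().split())] = value.strip()
    return global_params, modules


def audit(text):
    """Map each weakly configured module to a list of findings."""
    global_params, modules = parse_rsyncd_conf(text)
    findings = {}
    for name, params in modules.items():
        effective = {**global_params, **params}  # module values override globals
        problems = [f"missing '{key}'" for key in REQUIRED if not effective.get(key)]
        if effective.get("use chroot", "").lower() not in TRUE_VALUES:
            problems.append("'use chroot' not explicitly enabled")
        if problems:
            findings[name] = problems
    return findings


if __name__ == "__main__":
    for module, problems in audit(SAMPLE_CONF).items():
        print(f"[{module}] " + "; ".join(problems))
```

With the sample text above, the `backup` module passes and the `public` module is flagged, since it has neither a host restriction nor authentication.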
Even when a server is exposed, Rsync by default uses SSH to transport data from one end to the other, which provides encryption, so an eavesdropper should not be able to read the data in transit so easily. However, the Rsync daemon protocol, which does not provide encryption, appears to be significantly faster. It may therefore be tempting to use the daemon protocol, but one should bear in mind that the rest of the configuration will not be secure by default. It is up to the user to set the different parameters properly. This is true not only for Rsync itself, but also for other tools that rely on Rsync. For example, users often set up their Rsync servers through a cloud archive manager, so one has to take the time to check that the configuration is secure on both the manager side and the Rsync side.
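One practical way to tell which transport a given Rsync command relies on is to look at how the remote end is written. According to the rsync manual, a single colon (`host:path`) goes through a remote shell, while a double colon (`host::module`) or an `rsync://` URL contacts a daemon. The Python sketch below classifies an endpoint on that basis; the function name and the example hosts are ours, and it deliberately ignores the less common case of tunnelling the daemon protocol through a remote shell.

```python
def classify_rsync_endpoint(spec):
    """Classify an rsync source/destination as 'local', 'remote-shell' or 'daemon'."""
    if spec.startswith("rsync://"):
        return "daemon"  # rsync:// URLs always address a daemon
    colon = spec.find(":")
    slash = spec.find("/")
    # No colon, or a slash before the first colon: treated as a local path.
    if colon == -1 or (slash != -1 and slash < colon):
        return "local"
    if spec[colon:colon + 2] == "::":
        return "daemon"  # host::module, unencrypted daemon protocol
    return "remote-shell"  # host:path, SSH by default


if __name__ == "__main__":
    # Hypothetical hosts and paths, for illustration only.
    for spec in (
        "user@backup.example.com:/srv/data",
        "backup.example.com::backup/data",
        "rsync://backup.example.com/backup",
        "/srv/data",
    ):
        print(f"{spec} -> {classify_rsync_endpoint(spec)}")
```

Anything reported as `daemon` here travels without encryption unless it is wrapped in something else, so it deserves a second look before it carries sensitive data.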
Analysing the Rsync data breach alerts we’ve delivered, we find that they often result from the negligence of managed service providers, web agencies, or other third-party professionals. In many professional service companies, data security is not a priority. These third parties rarely have large teams dedicated to securing their data, and their staff are often not well trained in security configuration, so their Rsync servers are usually configured insecurely. Moreover, threat actors know this and treat these third parties as prime targets, which leads to the kinds of data leaks we find every day.
Who is in charge of making Rsync servers more secure against data leaks? In many enterprises, there is no clear answer. It’s true that the latest Rsync versions warn users about possible security weaknesses in their configurations, but the configuration options and their security implications could be made clearer. Third parties could dedicate more resources to the security of their infrastructure too. Enterprises could spend time and money to train the people they work with, internally and externally. Every link in the chain must take some responsibility for handling data securely.