Security and MQTT
Andrew Schofield | Tags: mqtt iot security internet_of_things
In this blog, I’m going to talk about security and MQTT. At the end of the blog, you should have a good understanding of the kind of things you need to consider to secure an MQTT-based system appropriately.
The Internet of Things is growing as more and more devices become connected to the internet. A lot has been written recently about security and the Internet of Things, much of it highlighting mistakes people have made when trying to secure their systems. Cars, household appliances, fitness gadgets, games consoles, smart meters and a lot more can communicate over the network. So, getting this stuff right is important and getting it wrong gets noticed.
Many of the security technologies used with MQTT are also used with HTTP. But because MQTT is a client-server publish/subscribe messaging protocol, there are some additional things to think about, such as the capacity of the MQTT server to buffer messages.
MQTT is used in a wide variety of environments ranging from enthusiasts automating their homes to critical infrastructure for smarter cities. The security requirements for these environments vary widely. You might be concerned about privacy of data and making sure that the publishers and subscribers are trusted. You’ll also want to ensure that the messaging servers remain dependable no matter how they are used or misused.
If you're using MQTT to help monitor and control something like a domestic greenhouse as a hobby, you could argue that security is unimportant. You run the risk of someone else receiving data published by your sensors, sending messages to change the environment inside your greenhouse, or even trying to attack the MQTT server so your high-tech greenhouse becomes a low-tech greenhouse. If it's just a hobby, the risks are perhaps acceptable. Personally, I’d tend to agree in this case, provided that you don’t inadvertently give public remote control to your water supply. But, if your passion is growing prize-winning fruit, you might take these matters more seriously. Some people will do anything to grow a bigger melon.
If you're using MQTT to publish the scores from an international sports tournament to web browsers, you're probably comfortable with any web browser connecting to receive the scores. But, I expect you'd be keen to ensure that only the sports tournament officials can publish the scores. So, for a publicly accessible server, there could be different security considerations for trusted clients on an internal network compared with those on the public internet.
The MQTT protocol is sometimes criticized for lacking security features directly in the protocol, but this is a conscious design decision. Each deployment has its own security requirements. Most deployments of MQTT make use of transport layer security, so the data is encrypted and its integrity validated without encryption being a feature of MQTT itself. Similarly, most uses of MQTT also make use of authorization features in the MQTT server to control access, but you won’t find access-control lists in the protocol.
There are several aspects to think about when trying to secure an MQTT system.
Transport Layer Security
Transport Layer Security (TLS) is a cryptographic protocol which provides communication security. The data is encrypted to prevent eavesdropping and checked for integrity to prevent tampering. It’s the technology used to provide secure access to web sites for things like internet banking. The earlier versions went by the name SSL, and that seems to have stuck since many people still use the earlier name.
When using TLS, the server has a certificate which the client validates to ensure that it's connecting to a server it trusts. As a connection is established, the client and server exchange information to create a secure, encrypted connection between them. This secure connection can then be used to transport MQTT.
Using TLS in this way gives privacy to the data but, beyond checking that the clients can support acceptably strong ciphers, the server doesn’t know anything about the clients. It is also possible to demand that the clients have certificates so that the server can also trust the clients. This option, called mutual authentication, is not always used because it brings along with it the requirement to manage the distribution of certificates to the clients.
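To make the client side of this concrete, here's a minimal sketch using only Python's standard ssl module. It builds a context that validates the server's certificate and host name, then opens a connection over which MQTT packets could be sent. The function names and the optional private CA file are my own assumptions for illustration. Port 8883 is the port conventionally used for MQTT over TLS.

```python
import socket
import ssl
from typing import Optional


def make_client_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS context that verifies the server's certificate and host name."""
    # create_default_context() turns on certificate and host name checking.
    # ca_file is an assumed path to a private CA bundle; None uses the system CAs.
    context = ssl.create_default_context(cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_secure_channel(host: str, port: int = 8883,
                        ca_file: Optional[str] = None) -> ssl.SSLSocket:
    """Connect and complete the TLS handshake; MQTT then flows over the result."""
    raw = socket.create_connection((host, port), timeout=10)
    try:
        return make_client_context(ca_file).wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

If the server's certificate doesn't chain to a trusted CA, or doesn't match the host name, the handshake fails before any MQTT data is exchanged.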
With TLS comes a performance cost. Firstly, the cost of establishing a secure connection is much higher than that of an insecure connection, both in terms of data exchanged between the client and server and in terms of processing. For connections which come and go frequently, this will put a burden on the server. Secondly, the encryption of the payload adds a performance burden. However, most commercial MQTT systems do use TLS to get the benefits it brings.
An MQTT client can provide a user name and password at the time that it connects. It is important to use a secure, encrypted connection when using the MQTT user name and password so that these credentials are not transmitted in the clear. The MQTT server is responsible for checking the user name and password. It’s likely that the supplied identity will be used for authorization purposes.
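To see why the encrypted connection matters, it helps to look at what actually goes on the wire. The sketch below hand-builds an MQTT 3.1.1 CONNECT packet following the layout in the specification. The helper names and the sample credentials are illustrative assumptions, and a real client library would do this for you.

```python
import struct
from typing import Optional


def encode_string(text: str) -> bytes:
    """MQTT strings are UTF-8 with a two-byte big-endian length prefix."""
    data = text.encode("utf-8")
    return struct.pack("!H", len(data)) + data


def encode_remaining_length(length: int) -> bytes:
    """Variable-length encoding: 7 bits per byte, top bit means 'more follows'."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("length cannot be encoded in an MQTT fixed header")
    out = bytearray()
    while True:
        digit = length % 128
        length //= 128
        if length > 0:
            digit |= 0x80
        out.append(digit)
        if length == 0:
            return bytes(out)


def build_connect(client_id: str, username: Optional[str] = None,
                  password: Optional[str] = None, clean_session: bool = True,
                  keep_alive: int = 60) -> bytes:
    """Build an MQTT 3.1.1 CONNECT packet (no Will message)."""
    flags = 0x02 if clean_session else 0x00
    payload = encode_string(client_id)
    if username is not None:
        flags |= 0x80
        payload += encode_string(username)
    if password is not None:
        if username is None:
            raise ValueError("MQTT 3.1.1 does not allow a password without a user name")
        flags |= 0x40
        payload += encode_string(password)
    # Protocol name "MQTT", protocol level 4, connect flags, keep-alive seconds.
    header = encode_string("MQTT") + bytes([4, flags]) + struct.pack("!H", keep_alive)
    body = header + payload
    return bytes([0x10]) + encode_remaining_length(len(body)) + body
```

Calling build_connect("sensor-1", "alice", "secret") produces bytes in which both "alice" and "secret" are plainly readable. Nothing in MQTT itself hides them, so without TLS anyone who can observe the network can read the credentials.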
If you want to lock down the MQTT network very tightly, one option is to use mutual authentication and generate the clients’ certificates yourself. In doing this, you’re acting as a certificate authority (CA) for those client certificates. You could then give each client its own identity as part of its certificate and use this identity as part of the authentication and authorization of the clients. This makes it very hard to impersonate a client; an attacker would need its certificate and private key. Even an attacker who does somehow manage to steal them can only impersonate that particular client.
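Here's a rough sketch of the server side of mutual authentication, again using Python's ssl module. It assumes you hold the server's certificate and key and a CA file containing only your own client CA. The file paths, the function names, and the choice of the certificate's common name as the client identity are all assumptions for illustration.

```python
import ssl
from typing import Optional


def make_server_context(cert_file: str, key_file: str,
                        client_ca_file: str) -> ssl.SSLContext:
    """TLS context for a server that insists on a client certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.load_cert_chain(cert_file, key_file)
    # Trust only the CA that you used to issue the client certificates.
    context.load_verify_locations(cafile=client_ca_file)
    # Fail the handshake if the client presents no valid certificate.
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def client_identity(peer_cert: dict) -> Optional[str]:
    """Pull the common name out of a dict shaped like SSLSocket.getpeercert()."""
    for rdn in peer_cert.get("subject", ()):
        for name, value in rdn:
            if name == "commonName":
                return value
    return None
```

After the handshake, the server could call getpeercert() on the accepted socket and pass the result to client_identity() to obtain the name it then uses for authorization.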
Sometimes MQTT is used with authorization frameworks such as OAuth 2.0. OAuth 2.0 enables separation of the authorization server from a resource server, such as an MQTT server. Sometimes companies use this to centralise checking of credentials in their infrastructure.
When using OAuth 2.0, the user presents their credentials to the authorization server which performs the authentication check and returns an access token encapsulating permission to access a resource. The access token is then passed to the resource server. The resource server validates the access token, usually by communicating with the authorization server, and then grants access to the resource. Using the access token, the user has securely gained access to a resource without presenting credentials to the resource server.
What does this mean in MQTT terms? This diagram will help explain.
In this simplified example, the access token is encapsulating authorization to connect to the MQTT server. In principle, the access token could give authorization to publish and subscribe on specific topics too.
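The flow can also be sketched as a small in-memory simulation. This is not an OAuth 2.0 implementation: the class names, the opaque random tokens, and the plain-text credential table are all hypothetical and exist only to show who checks what. How the token is carried to the MQTT server is left out.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class AuthorizationServer:
    """Checks credentials and hands out short-lived access tokens."""

    def __init__(self, users: Dict[str, str], token_lifetime: float = 300.0):
        self._users = users  # plain-text passwords, for illustration only
        self._lifetime = token_lifetime
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def issue_token(self, user: str, password: str) -> Optional[str]:
        expected = self._users.get(user)
        if expected is None or not secrets.compare_digest(
                expected.encode(), password.encode()):
            return None
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user, time.monotonic() + self._lifetime)
        return token

    def introspect(self, token: str) -> Optional[str]:
        """Return the user a token belongs to, or None if unknown or expired."""
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._tokens[token]
            return None
        return user


class MqttResourceServer:
    """Stands in for the MQTT server: it never sees the user's password."""

    def __init__(self, authorization_server: AuthorizationServer):
        self._authorization_server = authorization_server

    def accept_connect(self, access_token: str) -> bool:
        return self._authorization_server.introspect(access_token) is not None
```

The point to notice is that MqttResourceServer only ever handles the token. The password is seen by the authorization server alone.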
MQTT is a topic-based publish/subscribe protocol. Every message is published on a named topic and every subscription has a topic filter which may include wildcards. So, authorization is in terms of publishing, subscribing and topic names. Most MQTT servers have some way of granting authority to publish and subscribe on topics. How authority is actually granted varies from server to server, but it’s usually done using some kind of access-control lists or policies. For example:
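One way to picture this is as a list of rules, each granting a user the right to publish or subscribe on topics matching a filter. The sketch below implements MQTT's wildcard matching ("+" for one level, "#" for the remainder) and a default-deny check against such a list. The rule format, the user names and the topic names are invented for illustration; real servers each have their own configuration syntax. For simplicity, subscribe checks here are applied to a concrete topic name rather than to a wildcard filter.

```python
from typing import List, Tuple


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    # A filter starting with a wildcard must not match topics starting with '$'.
    if topic.startswith("$") and filter_levels[0] in ("+", "#"):
        return False
    for index, level in enumerate(filter_levels):
        if level == "#":
            # '#' is only valid as the last level; it matches everything
            # from here down, including the parent level itself.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    return len(filter_levels) == len(topic_levels)


# (user, action, topic filter); anything not listed is denied.
ACL: List[Tuple[str, str, str]] = [
    ("official", "publish", "tournament/scores/#"),
    ("anonymous", "subscribe", "tournament/scores/#"),
]


def is_authorized(user: str, action: str, topic: str) -> bool:
    return any(rule_user == user and rule_action == action
               and topic_matches(rule_filter, topic)
               for rule_user, rule_action, rule_filter in ACL)
```

With these rules, is_authorized("official", "publish", "tournament/scores/final") is allowed, while the same publish attempted by "anonymous" is refused.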
A server is going to have a maximum number of connections that it can sustain. If you’re in control of the clients and the server, you can ensure that the clients do not exceed the capabilities of the server. If you’re not in control of the clients, you may be able to set a limit for the number of concurrent connections which keeps the server within its operating capacity.
A server is also going to have some limit for the amount of message data it can hold at once. The MQTT specification permits messages up to 256 megabytes in size. Most devices cannot handle such large messages though. Think, for example, about the effect of a 256-megabyte message arriving on a mobile phone. If you are not in control of the publishers of messages, you may be able to set a limit for the maximum size of messages that can be published so that you can be sure that all of the messages in your system are a reasonable size.
Each subscription will consume some resources, both in terms of memory and in terms of matching published messages with subscriptions. A very straightforward way for a malicious client to consume resources on a server is to create a massive number of subscriptions. A given client can only have one subscription on any particular topic string, so it’s a good idea to authorize untrusted clients to subscribe to only the specific topic strings you’d like them to use.
Retained messages provide a way to ask the server to remember the most recent message published on a topic. But because of this, a malicious client who publishes retained messages on a wide range of topics can consume a lot of resources. To prevent this, use authorization to restrict untrusted clients to publish messages to only the specific topic strings you’d like them to use.
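The limits on connections, message size and subscriptions described above could be enforced with checks along these lines. This is a sketch of the idea rather than how any particular server does it: the class names and default values are assumptions, and retained messages are left to the topic authorization already discussed.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Largest Remaining Length an MQTT fixed header can encode (just under 256 MB).
PROTOCOL_MAX_BYTES = 268_435_455


@dataclass
class Limits:
    max_connections: int = 10_000
    max_message_bytes: int = 64 * 1024  # far below PROTOCOL_MAX_BYTES
    max_subscriptions_per_client: int = 20


@dataclass
class ServerGuard:
    limits: Limits = field(default_factory=Limits)
    connections: Set[str] = field(default_factory=set)
    subscriptions: Dict[str, Set[str]] = field(default_factory=dict)

    def allow_connect(self, connection_id: str) -> bool:
        """Refuse new connections once the server is at capacity."""
        if len(self.connections) >= self.limits.max_connections:
            return False
        self.connections.add(connection_id)
        return True

    def allow_publish(self, payload: bytes) -> bool:
        """Refuse messages larger than the configured maximum."""
        return len(payload) <= self.limits.max_message_bytes

    def allow_subscribe(self, connection_id: str, topic_filter: str) -> bool:
        """Cap the number of distinct subscriptions each client may hold."""
        current = self.subscriptions.setdefault(connection_id, set())
        if topic_filter in current:
            return True  # re-subscribing replaces the existing subscription
        if len(current) >= self.limits.max_subscriptions_per_client:
            return False
        current.add(topic_filter)
        return True
```

Choosing the numbers is the real work. They depend on the memory and throughput of your server and on how large a message your smallest device can cope with.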
Structuring your topics
In the first bullet above, you can see the power of using replacement variables in authorisation rules. Pattern-based authorisation like this means you can apply a small number of authorisation rules to a very wide number of clients.
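To make the idea of replacement variables concrete, here's a small sketch in which each rule contains a placeholder that is replaced with the connecting client's ID. The "{client_id}" placeholder syntax and the topic names are invented for this example; servers that support this feature have their own syntax.

```python
from typing import List, Tuple

# Two rules cover every dishwasher, however many there are.
RULES: List[Tuple[str, str]] = [
    ("publish", "dishwasher/{client_id}/status"),
    ("subscribe", "dishwasher/{client_id}/commands"),
]


def expand_rules(client_id: str) -> List[Tuple[str, str]]:
    """Substitute the client's own ID into every rule pattern."""
    # Reject IDs that could widen a rule by injecting wildcards or extra levels.
    if not client_id or any(ch in client_id for ch in "+#/"):
        raise ValueError("client ID is not safe to substitute into a topic")
    return [(action, pattern.replace("{client_id}", client_id))
            for action, pattern in RULES]


def is_allowed(client_id: str, action: str, topic: str) -> bool:
    return (action, topic) in expand_rules(client_id)
```

A client with ID "dw-0001" may publish to "dishwasher/dw-0001/status" but not to "dishwasher/dw-0002/status", and no per-client rule had to be written to achieve that.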
CleanSession and Client IDs
Finally, MQTT has a slightly unusual feature called the client ID. The idea is that every client provides a unique identifier as it connects. For CleanSession=0 clients, this is used by the server to resume a session when the client reconnects. The unusual part is the behaviour when a client connects using a client ID that’s already being used. The server disconnects the existing connection and lets the new connection take over. MQTT is designed like this because it’s expecting clients to use unique, assigned identifiers. If a client connects to find its client ID already in use, it’s because that same client disconnected and has reconnected before the server has had a chance to clean up.
In environments where clients are not assigned unique identifiers, this behaviour might be inconvenient. For example, a central publisher of information might use client ID “ResultsPublisher”. If another client tries to use exactly the same client ID, the central publisher will be disconnected. In a trusted environment, you could use authentication to ensure that this doesn’t happen. In an untrusted environment, such as the public internet, there doesn’t seem to be much value to providing a client ID at all. This is why MQTT 3.1.1 permits CleanSession=1 clients to connect without providing a client ID. If I were deploying MQTT 3.1.1 on the public internet, I would only accept connections from CleanSession=1 clients with no client ID.
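The takeover behaviour, and the policy I'd apply on the public internet, can be sketched in a few lines. The classes and the function name below are hypothetical stand-ins for what a server does internally.

```python
from typing import Dict


class Connection:
    """Stand-in for a network connection held by the server."""

    def __init__(self, label: str):
        self.label = label
        self.open = True

    def close(self) -> None:
        self.open = False


class SessionRegistry:
    """Tracks which connection currently owns each client ID."""

    def __init__(self) -> None:
        self._by_client_id: Dict[str, Connection] = {}

    def connect(self, client_id: str, connection: Connection) -> None:
        existing = self._by_client_id.get(client_id)
        if existing is not None and existing is not connection:
            # The newcomer wins: the existing connection is dropped.
            existing.close()
        self._by_client_id[client_id] = connection


def accept_on_public_listener(client_id: str, clean_session: bool) -> bool:
    """Only accept CleanSession=1 clients that supply no client ID."""
    return clean_session and client_id == ""
```

If two connections both register as "ResultsPublisher", the first one ends up closed. The accept_on_public_listener() check sidesteps the problem because untrusted clients never get to choose a client ID at all.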
Tying it all together
Each dishwasher is assigned a unique identifier which it uses as its MQTT client ID. The connections use CleanSession=0 so that the MQTT server remembers the dishwashers’ sessions and subscriptions even when they are disconnected. This allows reliable message delivery to be resumed when they reconnect and also means that messages matching the dishwashers’ subscriptions while they are not connected are queued up on the MQTT server awaiting reconnection.
If you want to accept connections from untrusted clients that you do not control, such as web browsers, you’re not in control of the number of clients or what they might attempt to do. Here are some points to consider:
Alternatively, if you just want to accept connections from trusted clients and the privacy of the data is important:
Now you should have a good idea of how to secure an MQTT-based system and the features of an MQTT server to look for so that you can secure it properly. Why not give it a try?