The tool cloud-init is normally used for cloud VMs and on first boot.
This article shows how to manually invoke cloud-init on an on-premises machine (e.g., in a small network, home lab, …) even after the machine has already booted.
This article is a starting point. We’ll provide the necessary cloud-init data using a standard HTTP server. You can later add features like HTTPS and dynamic data if needed.
The Datasource
The datasource specifies what cloud-init should do.
Normally, cloud-init detects which cloud provider it’s running on and automatically selects the appropriate datasource. (For a list of supported cloud providers, see the datasources documentation.)
To use cloud-init in an on-premises environment, you have to use the NoCloud datasource.
The NoCloud datasource can get its data from various sources - in this article, we’ll use HTTP.
cloud-init will only accept this datasource if it provides at least two files:
user-data
- contains the actual things to do.meta-data
- normally contains information provided by the cloud provider.
For this article, use the following contents for the user-data
file:
#cloud-config
# Don't delete existing SSH host keys.
ssh_deletekeys: false
runcmd:
- echo "it worked!" > /tmp/example.txt
For the meta-data
file, we will use the following contents:
instance-id: my-instance-001
As far as I can tell, the value of instance-id
doesn’t matter here and can be the same for all machines.
Its primary purpose is to let cloud-init detect whether it has already run on the machine.
The HTTP Server
We’ll use Docker to host the HTTP server that serves the cloud-init files:
services:
cloud-init-http:
image: caddy:alpine # https://caddyserver.com/
container_name: http-server
ports:
- "8080:8080"
volumes:
- ./cloud-init:/srv/cloud-init:ro
- ./Caddyfile:/etc/caddy/Caddyfile:ro
- caddy_config:/config
- caddy_data:/data
restart: unless-stopped
volumes:
# See: https://hub.docker.com/_/caddy/#how-to-use-this-image
caddy_config:
caddy_data:
The Caddyfile
:
{
log {
format console
level ERROR
}
}
http://:8080 {
root * /srv
file_server
}
With this, you should have this file tree:
/
├── docker-compose.yml
├── Caddyfile
└── cloud-init/
├── meta-data
└── user-data
This setup serves the cloud-init files unencrypted over HTTP. This is just for demonstration purposes.
Since cloud-init lets you run arbitrary commands, an attacker could modify your user-data
in transit and take over your server.
In production, you should secure the server with HTTPS/TLS. Caddy has built-in support for ACME/Let’s Encrypt.
On the Target System
On the system you want to setup with cloud-init, you need to register your HTTP server as the datasource.
To do this, create the file /etc/cloud/cloud.cfg.d/99_datasource.cfg
:
datasource_list: ["NoCloud"]
datasource:
NoCloud:
seedfrom: http://<your-server>:8080/cloud-init/
You can use both an IP address or a DNS name for <your-server>
.
Next, test the connection:
$ curl http://<your-server>:8080/cloud-init/user-data
$ curl http://<your-server>:8080/cloud-init/meta-data
Testing the connection is very important. If cloud-init can’t reach your HTTP server, it will fall back to an empty datasource.
Running cloud-init with an empty datasource will re-create the machine’s SSH keys on every run because the ssh_deletekeys
instruction defaults to true
.
After that, you can invoke cloud-init with the following commands:
$ cloud-init clean --logs --machine-id --seed
$ cloud-init init
$ cloud-init modules --mode=config
$ cloud-init modules --mode=final
$ touch /etc/cloud/cloud-init.disabled
The first command (cloud-init clean
) resets cloud-init so it behaves as if it has never run.
The next three commands execute the three cloud-init stages: init
, config
, and final
.
The last command ensures cloud-init doesn’t run again the next time the system is rebooted.
To see which modules run at which stage, check the cloud_init_modules
, cloud_config_modules
, and cloud_final_modules
sections in /etc/cloud/cloud.cfg
.
Troubleshooting
If anything goes wrong or doesn’t work as expected, these commands can help you troubleshoot:
Print the datasource:
cloud-id
If you followed this article, the output should be nocloud
.
Print the status:
cloud-init status --long
Check the logs:
less /var/log/cloud-init.log
less /var/log/cloud-init-output.log