# VM interface

This page provides an overview of the interface Hafnium provides to VMs. Hafnium
makes a distinction between the 'primary VM', which controls scheduling and has
more direct access to some hardware, and 'secondary VMs' which exist mostly to
provide services to the primary VM, and have a more paravirtualised interface.
The intention is that the primary VM can run a mostly unmodified operating
system (such as Linux) with the addition of a Hafnium driver, while secondary
VMs will run more specialised trusted OSes or bare-metal code which is designed
with Hafnium in mind.

The interface documented here is what is planned for the first release of
Hafnium, not necessarily what is currently implemented.

## CPU scheduling

The primary VM will have one vCPU for each physical CPU, and control the
scheduling.

Secondary VMs will have a configurable number of vCPUs, scheduled on arbitrary
physical CPUs at the whims of the primary VM scheduler.

All VMs will start with a single active vCPU. Subsequent vCPUs can be started
through PSCI.

## PSCI

The primary VM will be able to control the physical CPUs through the following
PSCI 1.1 calls, which will be forwarded to the underlying implementation in EL3:

* PSCI_VERSION
* PSCI_FEATURES
* PSCI_SYSTEM_OFF
* PSCI_SYSTEM_RESET
* PSCI_AFFINITY_INFO
* PSCI_CPU_SUSPEND
* PSCI_CPU_OFF
* PSCI_CPU_ON

All other PSCI calls are unsupported.

Secondary VMs will be able to control their vCPUs through the following PSCI 1.1
calls, which will be implemented by Hafnium:

* PSCI_VERSION
* PSCI_FEATURES
* PSCI_AFFINITY_INFO
* PSCI_CPU_SUSPEND
* PSCI_CPU_OFF
* PSCI_CPU_ON

All other PSCI calls are unsupported.

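The two lists above amount to a filter on the PSCI function ID. The following is
a minimal sketch of that decision, not Hafnium's actual code: the function name
`psci_call_allowed` and the macro names are hypothetical, while the numeric
function IDs are the SMC32 values from the PSCI specification (the SMC64
variants differ only in bit 30). It only models which calls are accepted, not
how they are forwarded to EL3 or emulated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PSCI function IDs (SMC32 variants) as defined by the PSCI specification. */
#define PSCI_VERSION          UINT32_C(0x84000000)
#define PSCI_CPU_SUSPEND_32   UINT32_C(0x84000001)
#define PSCI_CPU_OFF          UINT32_C(0x84000002)
#define PSCI_CPU_ON_32        UINT32_C(0x84000003)
#define PSCI_AFFINITY_INFO_32 UINT32_C(0x84000004)
#define PSCI_SYSTEM_OFF       UINT32_C(0x84000008)
#define PSCI_SYSTEM_RESET     UINT32_C(0x84000009)
#define PSCI_FEATURES         UINT32_C(0x8400000A)

/* SMC64 variants of a call have bit 30 set in the function ID. */
#define SMC64_BIT UINT32_C(0x40000000)

static_assert((PSCI_CPU_ON_32 | SMC64_BIT) == UINT32_C(0xC4000003),
              "SMC64 CPU_ON function ID");

/*
 * Hypothetical helper: returns true if a VM of the given kind may make the
 * PSCI call identified by func, following the two lists in this section.
 */
bool psci_call_allowed(bool is_primary, uint32_t func)
{
    switch (func) {
    case PSCI_VERSION:
    case PSCI_FEATURES:
    case PSCI_CPU_OFF:
    case PSCI_CPU_SUSPEND_32:
    case PSCI_CPU_SUSPEND_32 | SMC64_BIT:
    case PSCI_CPU_ON_32:
    case PSCI_CPU_ON_32 | SMC64_BIT:
    case PSCI_AFFINITY_INFO_32:
    case PSCI_AFFINITY_INFO_32 | SMC64_BIT:
        /* Available to every VM. */
        return true;
    case PSCI_SYSTEM_OFF:
    case PSCI_SYSTEM_RESET:
        /* Whole-system power control is reserved for the primary VM. */
        return is_primary;
    default:
        /* All other PSCI calls are unsupported. */
        return false;
    }
}
```

For example, `psci_call_allowed(false, PSCI_SYSTEM_OFF)` is false, because a
secondary VM may not power off the system.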
## Hardware timers

The primary VM will have access to both the physical and virtual EL1 timers
through the usual control registers (`CNT[PV]_TVAL_EL0` and `CNT[PV]_CTL_EL0`).

Secondary VMs will have access to the virtual timer only, which will be emulated
with help from the kernel driver in the primary VM.

## Interrupts

The primary VM will have direct access to control the physical GIC, and receive
all interrupts (other than anything already trapped by TrustZone). It will be
responsible for forwarding any necessary interrupts to secondary VMs. The
Interrupt Translation Service (ITS) will be disabled by Hafnium so that it
cannot be used to circumvent access controls.

Secondary VMs will have access to a simple paravirtualized interrupt controller
through two hypercalls: one to enable or disable a given virtual interrupt ID,
and one to get and acknowledge the next pending interrupt. There is no concept
of interrupt priorities or a distinction between edge and level triggered
interrupts. Secondary VMs may also inject interrupts into their own vCPUs.

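The paravirtualized controller described above can be modelled as two bitmaps
per vCPU. This is a sketch under stated assumptions rather than Hafnium's
implementation: the names `vint_enable`, `vint_inject` and `vint_get`, the
64-entry ID space, and the `VINT_NONE` return value are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VINT_COUNT 64u       /* Assumed size of the virtual interrupt ID space. */
#define VINT_NONE UINT32_MAX /* Assumed "nothing pending" return value. */

/* Per-vCPU state: one bit per virtual interrupt ID. */
struct vint_state {
    uint64_t enabled; /* Bit n set: interrupt ID n is enabled. */
    uint64_t pending; /* Bit n set: interrupt ID n is waiting to be taken. */
};

/* First hypercall: enable or disable a given virtual interrupt ID. */
void vint_enable(struct vint_state *s, uint32_t id, bool enable)
{
    assert(id < VINT_COUNT);
    if (enable) {
        s->enabled |= UINT64_C(1) << id;
    } else {
        s->enabled &= ~(UINT64_C(1) << id);
    }
}

/* Marks an interrupt as pending (forwarded by the primary or self-injected). */
void vint_inject(struct vint_state *s, uint32_t id)
{
    assert(id < VINT_COUNT);
    s->pending |= UINT64_C(1) << id;
}

/*
 * Second hypercall: get and acknowledge the next pending, enabled interrupt.
 * There are no priorities, so the scan order (lowest ID first) is arbitrary.
 */
uint32_t vint_get(struct vint_state *s)
{
    uint64_t ready = s->enabled & s->pending;

    for (uint32_t id = 0; id < VINT_COUNT; id++) {
        if (ready & (UINT64_C(1) << id)) {
            s->pending &= ~(UINT64_C(1) << id); /* Acknowledge. */
            return id;
        }
    }
    return VINT_NONE;
}
```

In this model an interrupt injected while disabled stays pending and is only
returned by `vint_get` once the VM enables it.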
## Performance counters

VMs will be blocked from accessing performance counter registers (for the
performance monitor extensions described in chapter D5 of the ARMv8-A reference
manual) in production, to prevent them from being used as a side channel to leak
data between VMs.

Hafnium may allow VMs to use them in debug builds.

## Debug registers

VMs will be blocked from accessing debug registers in production builds, to
prevent them from being used to circumvent access controls.

Hafnium may allow VMs to use these registers in debug builds.

## RAS Extension registers

Secondary VMs will be blocked from using registers associated with the RAS
Extension.

## Asynchronous message passing

VMs will be able to send messages of up to 4 KiB to each other asynchronously,
with no queueing, as specified by SPCI.

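One plausible reading of "no queueing" is that each VM has a single receive
buffer, and a send fails while that buffer still holds an unread message. The
sketch below illustrates that reading only; the `mailbox` type and function
names are hypothetical and the real behaviour is defined by SPCI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MSG_MAX 4096u /* Messages are limited to 4 KiB. */

/* Hypothetical single-slot receive buffer for one VM. */
struct mailbox {
    unsigned char buf[MSG_MAX];
    size_t len;
    bool full;
};

/*
 * Copies a message into the recipient's buffer. Returns false if the message
 * is too large or the previous message has not been released yet; nothing is
 * queued, so the sender has to retry later.
 */
bool mailbox_send(struct mailbox *to, const void *data, size_t len)
{
    assert(to != NULL && data != NULL);
    if (len > MSG_MAX || to->full) {
        return false;
    }
    memcpy(to->buf, data, len);
    to->len = len;
    to->full = true;
    return true;
}

/* Called by the recipient once it has finished reading the message. */
void mailbox_release(struct mailbox *mb)
{
    assert(mb != NULL);
    mb->len = 0;
    mb->full = false;
}
```

A second `mailbox_send` to the same recipient fails until the recipient calls
`mailbox_release`.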
## Memory

VMs will statically be given access to mutually-exclusive regions of the
physical address space at boot. This includes MMIO space for controlling
devices, plus a fixed amount of RAM for secondaries, and all remaining address
space to the primary. Note that this means that only one VM can control any
given page of MMIO registers for a device.

VMs may choose to donate or share their memory with other VMs at runtime. Any
given page may be shared with at most 2 VMs at once (including the original
owning VM). Memory which has been donated or shared may not be forcefully
reclaimed, but the VM with which it was shared may choose to return it.

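The sharing rules above can be expressed as a small per-page state machine. The
following sketch assumes a hypothetical `page_state` record holding an owner and
at most one borrower; the names and the 16-bit VM ID type are illustrative, not
taken from Hafnium.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VM_NONE UINT16_MAX /* Assumed marker for "no borrower". */

/* Hypothetical per-page record: an owner plus at most one borrower. */
struct page_state {
    uint16_t owner;
    uint16_t borrower;
};

/* Share: the owner keeps access and one other VM gains it (two VMs at most). */
bool page_share(struct page_state *p, uint16_t from, uint16_t to)
{
    assert(to != VM_NONE);
    if (p->owner != from || p->borrower != VM_NONE || to == from) {
        return false;
    }
    p->borrower = to;
    return true;
}

/* Donate: ownership moves to the other VM and the donor loses access. */
bool page_donate(struct page_state *p, uint16_t from, uint16_t to)
{
    assert(to != VM_NONE);
    if (p->owner != from || p->borrower != VM_NONE || to == from) {
        return false;
    }
    p->owner = to;
    return true;
}

/*
 * Return: only the borrower can end a share. The owner has no corresponding
 * operation, which models "may not be forcefully reclaimed".
 */
bool page_return(struct page_state *p, uint16_t caller)
{
    assert(caller != VM_NONE);
    if (p->borrower != caller) {
        return false;
    }
    p->borrower = VM_NONE;
    return true;
}
```

In this model, returning donated memory is simply another `page_donate` back to
the original owner, initiated by the VM that received it.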
## Cache

VMs will be blocked from using cache maintenance instructions that operate by
set/way. These operations are difficult to virtualize, and could expose the
system to side-channel attacks.

## Logging

VMs may send a character to a shared log by means of a hypercall or SMC call.
These log messages will be buffered per VM to make complete lines, then output
to a Hafnium-owned UART and saved in a shared ring buffer which may be extracted
from RAM dumps. VM IDs will be prepended to these logs.

This log API is intended for use in early bringup and low-level debugging. No
sensitive data should be logged through it. Higher level logs can be sent to the
primary VM through the asynchronous message passing mechanism described above,
or through shared memory.

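The per-VM line buffering described above might look roughly like the sketch
below. The structure, the function name `vm_log_putc`, the 128-byte line limit
and the `VM <id>: ` prefix format are all assumptions made for illustration; the
sketch writes the completed line into a caller-supplied buffer instead of a UART
or ring buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define LOG_LINE_MAX 128u /* Assumed maximum buffered line length. */

/* Hypothetical per-VM log buffer. */
struct vm_log {
    uint16_t vm_id;
    char line[LOG_LINE_MAX];
    size_t len;
};

/*
 * Buffers one character from a VM. When a newline arrives, or the buffer
 * fills up, the line is formatted into out with the VM ID prepended and the
 * buffer is reset. Returns 0 while still buffering, otherwise the length of
 * the formatted line as reported by snprintf.
 */
int vm_log_putc(struct vm_log *log, char c, char *out, size_t out_size)
{
    int n;

    assert(log->len < LOG_LINE_MAX);
    if (c != '\n') {
        log->line[log->len++] = c;
        if (log->len < LOG_LINE_MAX) {
            return 0; /* Line not complete yet. */
        }
    }
    n = snprintf(out, out_size, "VM %u: %.*s\n", (unsigned)log->vm_id,
                 (int)log->len, log->line);
    log->len = 0;
    return n;
}
```

Because each VM has its own buffer, characters from different VMs cannot
interleave within a line.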
## Configuration

Hafnium will read configuration from a flattened device tree blob (FDT). This
may either be the same device tree used for the other details of the system or a
separate minimal one just for Hafnium. This will include at least:

* The available RAM.
* The number of secondary VMs, how many vCPUs each should have, how much
  memory to assign to each of them, and where to load their initial images.
  (Most likely the initial image will be a minimal loader supplied with
  Hafnium which will validate and load the rest of the image from the primary
  later on.)
* Which devices exist on the system, their details (MMIO regions, interrupts
  and SYSMMU details), and which VM each is assigned to.
    * A single physical device may be split into multiple logical devices
      from Hafnium's point of view if necessary to have different VMs own
      different parts of it.
* A whitelist of which SMC calls each VM is allowed to make.

## Failure handling

If a secondary VM tries to do something it shouldn't, Hafnium will either inject
a fault or kill it and inform the primary VM. The primary VM may choose to
restart the system or to continue without the secondary VM.

If the primary VM tries to do something it shouldn't, Hafnium will either inject
a fault or restart the system.

## TrustZone communication

The primary VM will be able to communicate with a TEE running in TrustZone
either through SPCI messages or through whitelisted SMC calls, and through
shared memory.

## Other SMC calls

Other than the PSCI calls described above and those used to communicate with
Hafnium, all SMC calls will be blocked by default. Hafnium will allow SMC
calls to be whitelisted on a per-VM, per-function ID basis, as part of the
static configuration described above. These whitelisted SMC calls will be
forwarded to the EL3 handler with the client ID (as described by the SMCCC) set
to the calling VM's ID.
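
The filtering step above can be sketched as a lookup in a per-VM list of
function IDs followed by a rewrite of the client ID. This is an illustration
under assumptions, not Hafnium's code: `smc_whitelist` and `smc_filter` are
hypothetical names, and the sketch assumes the SMCCC convention of carrying the
client ID in bits [15:0] of the eighth argument register (W7/X7), leaving the
remaining bits untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VM whitelist, built from the static configuration. */
struct smc_whitelist {
    const uint32_t *func_ids;
    size_t count;
};

/*
 * Decides whether an SMC from vm_id may be forwarded to EL3. regs holds
 * x0..x7 of the caller, with the function ID in the low 32 bits of regs[0].
 * On success the client ID field of regs[7] is overwritten with the VM's ID
 * so that the caller cannot impersonate another VM.
 */
bool smc_filter(const struct smc_whitelist *wl, uint16_t vm_id,
                uint64_t regs[8])
{
    uint32_t func;

    assert(wl != NULL && regs != NULL);
    func = (uint32_t)regs[0];

    for (size_t i = 0; i < wl->count; i++) {
        if (wl->func_ids[i] == func) {
            regs[7] = (regs[7] & ~UINT64_C(0xFFFF)) | vm_id;
            return true; /* Forward to the EL3 handler. */
        }
    }
    return false; /* Blocked by default. */
}
```

A call whose function ID is not in the VM's list is rejected before it reaches
EL3, which matches the block-by-default policy.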