# VM interface

This page provides an overview of the interface Hafnium provides to VMs. Hafnium
makes a distinction between the 'primary VM', which controls scheduling and has
more direct access to some hardware, and 'secondary VMs' which exist mostly to
provide services to the primary VM, and have a more paravirtualised interface.
The intention is that the primary VM can run a mostly unmodified operating
system (such as Linux) with the addition of a Hafnium driver which
[fulfils certain expectations](SchedulerExpectations.md), while secondary VMs
will run more specialised trusted OSes or bare-metal code which is designed with
Hafnium in mind.

The interface documented here is what is planned for the first release of
Hafnium, not necessarily what is currently implemented.

[TOC]

## CPU scheduling

The primary VM will have one vCPU for each physical CPU, and control the
scheduling.

Secondary VMs will have a configurable number of vCPUs, scheduled on arbitrary
physical CPUs at the whims of the primary VM scheduler.

All VMs will start with a single active vCPU. Subsequent vCPUs can be started
through PSCI.

## PSCI

The primary VM will be able to control the physical CPUs through the following
PSCI 1.1 calls, which will be forwarded to the underlying implementation in EL3:

* PSCI_VERSION
* PSCI_FEATURES
* PSCI_SYSTEM_OFF
* PSCI_SYSTEM_RESET
* PSCI_AFFINITY_INFO
* PSCI_CPU_SUSPEND
* PSCI_CPU_OFF
* PSCI_CPU_ON

All other PSCI calls are unsupported.

Secondary VMs will be able to control their vCPUs through the following PSCI 1.1
calls, which will be implemented by Hafnium:

* PSCI_VERSION
* PSCI_FEATURES
* PSCI_AFFINITY_INFO
* PSCI_CPU_SUSPEND
* PSCI_CPU_OFF
* PSCI_CPU_ON

All other PSCI calls are unsupported.
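
The two lists above amount to a filter on PSCI function IDs that depends on
whether the caller is the primary VM. The following is a minimal sketch of such
a filter, not Hafnium's actual code: the function IDs are the ones defined by
the PSCI specification, while the helper name `psci_call_allowed` and its
`is_primary` parameter are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Function IDs as defined by the PSCI specification. */
#define PSCI_VERSION          UINT32_C(0x84000000)
#define PSCI_CPU_SUSPEND_32   UINT32_C(0x84000001)
#define PSCI_CPU_OFF          UINT32_C(0x84000002)
#define PSCI_CPU_ON_32        UINT32_C(0x84000003)
#define PSCI_AFFINITY_INFO_32 UINT32_C(0x84000004)
#define PSCI_SYSTEM_OFF       UINT32_C(0x84000008)
#define PSCI_SYSTEM_RESET     UINT32_C(0x84000009)
#define PSCI_FEATURES         UINT32_C(0x8400000A)
#define PSCI_CPU_SUSPEND_64   UINT32_C(0xC4000001)
#define PSCI_CPU_ON_64        UINT32_C(0xC4000003)
#define PSCI_AFFINITY_INFO_64 UINT32_C(0xC4000004)

/* Hypothetical helper: may a VM of this kind make the given PSCI call? */
bool psci_call_allowed(bool is_primary, uint32_t func)
{
    switch (func) {
    /* Calls available to every VM. */
    case PSCI_VERSION:
    case PSCI_FEATURES:
    case PSCI_AFFINITY_INFO_32:
    case PSCI_AFFINITY_INFO_64:
    case PSCI_CPU_SUSPEND_32:
    case PSCI_CPU_SUSPEND_64:
    case PSCI_CPU_OFF:
    case PSCI_CPU_ON_32:
    case PSCI_CPU_ON_64:
        return true;
    /* System-wide power control is reserved for the primary VM. */
    case PSCI_SYSTEM_OFF:
    case PSCI_SYSTEM_RESET:
        return is_primary;
    /* All other PSCI calls are unsupported. */
    default:
        return false;
    }
}
```

The sketch only decides whether a call is permitted. Whether a permitted call
is forwarded to EL3 (primary VM) or handled by Hafnium itself (secondary VMs)
is a separate step that is not modelled here.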

## Hardware timers

The primary VM will have access to both the physical and virtual EL1 timers
through the usual control registers (`CNT[PV]_TVAL_EL0` and `CNT[PV]_CTL_EL0`).

Secondary VMs will have access to the virtual timer only, which will be emulated
with help from the kernel driver in the primary VM.

## Interrupts

The primary VM will have direct access to control the physical GIC, and receive
all interrupts (other than anything already trapped by TrustZone). It will be
responsible for forwarding any necessary interrupts to secondary VMs. The
Interrupt Translation Service (ITS) will be disabled by Hafnium so that it
cannot be used to circumvent access controls.

Secondary VMs will have access to a simple paravirtualized interrupt controller
through two hypercalls: one to enable or disable a given virtual interrupt ID,
and one to get and acknowledge the next pending interrupt. There is no concept
of interrupt priorities or a distinction between edge and level triggered
interrupts. Secondary VMs may also inject interrupts into their own vCPUs.
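
The behaviour of the paravirtualized interrupt controller can be modelled with
two bitmaps per vCPU. The sketch below is illustrative only: the type and
function names, the limit of 64 interrupt IDs and the `INVALID_INTID` return
value are all assumptions, not Hafnium's hypercall ABI. Returning the
lowest-numbered interrupt first is an arbitrary choice of this sketch, since the
interface has no priorities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_INTIDS 64u           /* assumed limit for this sketch */
#define INVALID_INTID UINT32_MAX /* returned when nothing is deliverable */

struct vcpu_interrupts {
    uint64_t enabled; /* bit n set: virtual interrupt n is enabled */
    uint64_t pending; /* bit n set: virtual interrupt n is pending */
};

/* Models the hypercall that enables or disables a virtual interrupt ID. */
void interrupt_enable(struct vcpu_interrupts *s, uint32_t intid, bool enable)
{
    assert(intid < NUM_INTIDS);
    if (enable) {
        s->enabled |= UINT64_C(1) << intid;
    } else {
        s->enabled &= ~(UINT64_C(1) << intid);
    }
}

/* Models injection of a virtual interrupt into this vCPU. */
void interrupt_inject(struct vcpu_interrupts *s, uint32_t intid)
{
    assert(intid < NUM_INTIDS);
    s->pending |= UINT64_C(1) << intid;
}

/*
 * Models the "get and acknowledge" hypercall: returns an interrupt that is
 * both pending and enabled, clearing its pending bit, or INVALID_INTID.
 */
uint32_t interrupt_get(struct vcpu_interrupts *s)
{
    uint64_t ready = s->pending & s->enabled;

    for (uint32_t i = 0; i < NUM_INTIDS; i++) {
        if (ready & (UINT64_C(1) << i)) {
            s->pending &= ~(UINT64_C(1) << i);
            return i;
        }
    }
    return INVALID_INTID;
}
```

In this model an interrupt injected while disabled stays pending and is
delivered once it is enabled; whether Hafnium behaves the same way is not
stated on this page.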

## Performance counters

VMs will be blocked from accessing performance counter registers (for the
performance monitor extensions described in chapter D5 of the ARMv8-A reference
manual) in production, to prevent them from being used as a side channel to leak
data between VMs.

Hafnium may allow VMs to use them in debug builds.

## Debug registers

VMs will be blocked from accessing debug registers in production builds, to
prevent them from being used to circumvent access controls.

Hafnium may allow VMs to use these registers in debug builds.

## RAS Extension registers

Secondary VMs will be blocked from using registers associated with the RAS
Extension.

## Asynchronous message passing

VMs will be able to send messages of up to 4 KiB to each other asynchronously,
with no queueing, as specified by SPCI.
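
"No queueing" means each receiver effectively has a single message slot: a
second message cannot be delivered until the first has been consumed. The sketch
below models that with a one-slot mailbox. It is not the SPCI interface itself;
`struct mailbox`, `mailbox_send` and `mailbox_release` are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAILBOX_SIZE 4096u /* 4 KiB message limit */

struct mailbox {
    bool full;  /* true while an unread message occupies the slot */
    size_t len; /* length of the message currently in the slot */
    unsigned char data[MAILBOX_SIZE];
};

/*
 * Copies a message into the receiver's slot. Fails if the message is larger
 * than 4 KiB or if the slot is still occupied (there is no queue).
 */
bool mailbox_send(struct mailbox *to, const void *msg, size_t len)
{
    assert(to != NULL && msg != NULL);
    if (len > MAILBOX_SIZE || to->full) {
        return false;
    }
    memcpy(to->data, msg, len);
    to->len = len;
    to->full = true;
    return true;
}

/* Called by the receiver once it has read the message, freeing the slot. */
void mailbox_release(struct mailbox *mb)
{
    assert(mb != NULL);
    mb->full = false;
    mb->len = 0;
}
```

A sender whose `mailbox_send` fails because the slot is occupied has to retry
later; how it learns that the slot has been freed is outside this sketch.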

## Memory

VMs will statically be given access to mutually-exclusive regions of the
physical address space at boot. This includes MMIO space for controlling
devices, plus a fixed amount of RAM for secondaries, and all remaining address
space to the primary. Note that this means that only one VM can control any
given page of MMIO registers for a device.

VMs may choose to donate or share their memory with other VMs at runtime. Any
given page may be shared with at most 2 VMs at once (including the original
owning VM). Memory which has been donated or shared may not be forcefully
reclaimed, but the VM with which it was shared may choose to return it.
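
The sharing rules above can be expressed as a small per-page state machine. The
following sketch tracks one owner and at most one borrower per page; the
`struct page_state` type, the `NO_VM` sentinel and the three functions are
hypothetical and do not correspond to Hafnium's memory-sharing hypercalls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_VM UINT16_MAX /* sentinel: page is not shared with anyone */

struct page_state {
    uint16_t owner;    /* VM that owns the page */
    uint16_t borrower; /* VM the page is shared with, or NO_VM */
};

/* Shares a page with one other VM. A page can have at most one borrower. */
bool page_share(struct page_state *p, uint16_t from, uint16_t to)
{
    assert(p != NULL);
    if (p->owner != from || p->borrower != NO_VM || to == from) {
        return false;
    }
    p->borrower = to;
    return true;
}

/* Donates an unshared page: ownership moves to the receiving VM. */
bool page_donate(struct page_state *p, uint16_t from, uint16_t to)
{
    assert(p != NULL);
    if (p->owner != from || p->borrower != NO_VM || to == from) {
        return false;
    }
    p->owner = to;
    return true;
}

/* Ends a share. Only the borrower may do this; the owner cannot force it. */
bool page_return(struct page_state *p, uint16_t caller)
{
    assert(p != NULL);
    if (p->borrower == NO_VM || p->borrower != caller) {
        return false;
    }
    p->borrower = NO_VM;
    return true;
}
```

In this model a donated page can only come back through a fresh donation by its
new owner, which matches the rule that donated memory cannot be forcefully
reclaimed.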

## Cache

VMs will be blocked from using cache maintenance instructions that operate by
set/way. These operations are difficult to virtualize, and could expose the
system to side-channel attacks.

## Logging

VMs may send a character to a shared log by means of a hypercall or SMC call.
These log messages will be buffered per VM to make complete lines, then output
to a Hafnium-owned UART and saved in a shared ring buffer which may be extracted
from RAM dumps. VM IDs will be prepended to these logs.
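
Per-VM line buffering could look roughly like the sketch below, which collects
characters until a newline arrives (or the buffer fills) and then formats one
complete line with the VM ID in front. The buffer size, the `VM <id>: ` prefix
format and all names are assumptions made for illustration; writing to the UART
and the ring buffer is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define LOG_LINE_MAX 128u /* assumed size of the per-VM line buffer */

struct vm_log {
    uint16_t vm_id;
    size_t len; /* number of buffered characters */
    char line[LOG_LINE_MAX];
};

/*
 * Buffers one character for the given VM. When a newline arrives or the
 * buffer is full, formats "VM <id>: <line>\n" into `out` and returns the
 * snprintf result; otherwise returns 0 and leaves `out` untouched.
 */
int vm_log_putc(struct vm_log *log, char c, char *out, size_t out_size)
{
    int written;

    assert(log != NULL && out != NULL);
    if (c != '\n') {
        log->line[log->len++] = c;
        if (log->len < LOG_LINE_MAX) {
            return 0; /* line not complete yet */
        }
    }
    written = snprintf(out, out_size, "VM %u: %.*s\n",
                       (unsigned int)log->vm_id, (int)log->len, log->line);
    log->len = 0;
    return written;
}
```

Because each VM has its own buffer, characters logged concurrently by different
VMs cannot interleave within a line.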

This log API is intended for use in early bringup and low-level debugging. No
sensitive data should be logged through it. Higher level logs can be sent to the
primary VM through the asynchronous message passing mechanism described above,
or through shared memory.

## Configuration

Hafnium will read configuration from a flattened device tree blob (FDT). This
may either be the same device tree used for the other details of the system or a
separate minimal one just for Hafnium. This will include at least:

* The available RAM.
* The number of secondary VMs, how many vCPUs each should have, how much
    memory to assign to each of them, and where to load their initial images.
    (Most likely the initial image will be a minimal loader supplied with
    Hafnium which will validate and load the rest of the image from the primary
    later on.)
* Which devices exist on the system, their details (MMIO regions, interrupts
    and SYSMMU details), and which VM each is assigned to.
    * A single physical device may be split into multiple logical ‘devices’
        from Hafnium’s point of view if necessary to have different VMs own
        different parts of it.
* A whitelist of which SMC calls each VM is allowed to make.

## Failure handling

If a secondary VM tries to do something it shouldn't, Hafnium will either inject
a fault or kill it and inform the primary VM. The primary VM may choose to
restart the system or to continue without the secondary VM.

If the primary VM tries to do something it shouldn't, Hafnium will either inject
a fault or restart the system.

## TrustZone communication

The primary VM will be able to communicate with a TEE running in TrustZone
either through SPCI messages or through whitelisted SMC calls, and through
shared memory.

## Other SMC calls

Other than the PSCI calls described above and those used to communicate with
Hafnium, all other SMC calls will be blocked by default. Hafnium will allow SMC
calls to be whitelisted on a per-VM, per-function ID basis, as part of the
static configuration described above. These whitelisted SMC calls will be
forwarded to the EL3 handler with the client ID (as described by the SMCCC) set
to the calling VM's ID.
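
A per-VM whitelist check followed by client ID tagging might be sketched as
follows. This assumes the SMCCC convention of carrying the client ID in bits
[15:0] of `w7`; the structures and the `smc_prepare_forward` helper are
hypothetical, and the actual SMC to EL3 is not issued here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CLIENT_ID_MASK UINT64_C(0xFFFF) /* client ID field in w7 (assumed) */

/* Registers x0..x7 of an SMC; x0 holds the function ID. */
struct smc_args {
    uint64_t x[8];
};

/* Hypothetical per-VM policy taken from the static configuration. */
struct vm_smc_policy {
    uint16_t vm_id;
    const uint32_t *allowed; /* whitelisted function IDs */
    size_t allowed_count;
};

/*
 * Returns true if the call is whitelisted for this VM, after stamping the
 * VM's ID into the client ID field. Returns false if the call is blocked.
 */
bool smc_prepare_forward(const struct vm_smc_policy *vm, struct smc_args *args)
{
    uint32_t func;

    assert(vm != NULL && args != NULL);
    func = (uint32_t)args->x[0];
    for (size_t i = 0; i < vm->allowed_count; i++) {
        if (vm->allowed[i] == func) {
            /* Overwrite whatever client ID the VM supplied itself. */
            args->x[7] = (args->x[7] & ~CLIENT_ID_MASK) | vm->vm_id;
            return true;
        }
    }
    return false;
}
```

Overwriting the client ID field, rather than trusting the value supplied by the
VM, is what lets the EL3 handler attribute each forwarded call to the VM that
made it.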