perf(bl31): convert cpu_data fetching to C

The assembly routines are opaque to the compiler and it can't inline
them. There is also no requirement for them to be called without a
stack - each of their calls has a stack available. So convert them to C
so that the compiler can do its inlining magic.

On AArch32 we need to be able to call _cpu_data from the entrypoint so
it has to stay as a slight exception.

We can also straighten out the type of the cpu_ops_ptr member so we
don't have to cast it everywhere.

Change-Id: I9c2939a955b396edf26b99ef36318eebeaab13e6
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
diff --git a/lib/psci/psci_setup.c b/lib/psci/psci_setup.c
index 0863a82..44c9bdb 100644
--- a/lib/psci/psci_setup.c
+++ b/lib/psci/psci_setup.c
@@ -63,8 +63,7 @@
 		/* Initialize with an invalid mpidr */
 		psci_cpu_pd_nodes[node_idx].mpidr = PSCI_INVALID_MPIDR;
 
-		svc_cpu_data =
-			&(_cpu_data_by_index(node_idx)->psci_svc_cpu_data);
+		svc_cpu_data = &get_cpu_data_by_index(node_idx, psci_svc_cpu_data);
 
 		/* Set the Affinity Info for the cores as OFF */
 		svc_cpu_data->aff_info_state = AFF_STATE_OFF;