Move carry propagation out of mpi_sub_hlp

The function mpi_sub_hlp had confusing semantics: although it took a
size parameter, it accessed the limb array d beyond this size, to
propagate the carry. This made the function difficult to understand
and analyze, with a potential buffer overflow if misused (not enough
room to propagate the carry).

Change the function so that it only performs the subtraction within
the specified number of limbs, and returns the carry.
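
As a rough illustration of the new contract, here is a standalone
sketch (not the library code). The names limb_t and sub_limbs are made
up for this example and stand in for mbedtls_mpi_uint and mpi_sub_hlp;
a fixed 32-bit limb is assumed.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t; /* assumption: stand-in for mbedtls_mpi_uint */

/* Compute d -= s over exactly n limbs, i.e. modulo (2^32)^n.
 * Never touches d[n] or beyond. Returns the borrow out: 1 if d < s, else 0. */
static limb_t sub_limbs( size_t n, limb_t *d, const limb_t *s )
{
    limb_t c = 0;

    for( size_t i = 0; i < n; i++ )
    {
        limb_t z = ( d[i] < c );    /* borrow caused by the incoming carry */
        d[i] -= c;
        c = ( d[i] < s[i] ) + z;    /* at most one of the two terms is 1 */
        d[i] -= s[i];
    }

    return( c );
}
```

With this shape, the caller decides what to do with the returned
borrow: propagate it into higher limbs, or fold it into a single limb.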

Move the carry propagation out of mpi_sub_hlp and into its caller
mbedtls_mpi_sub_abs. This makes the subtraction code in
mbedtls_mpi_sub_abs slightly less neat, but the propagation loop itself
is the same as before.

In the one other place where mpi_sub_hlp is used, namely mpi_montmul,
this is a net win because the carry is potentially sensitive data and
the function carefully arranges to not have to propagate it.
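
The mpi_montmul pattern can be sketched in isolation as follows. This
is a simplified illustration, not the library code: limb_t, sub_limbs
and sub_and_flag are hypothetical names, a 32-bit limb is assumed, and
the input is assumed to satisfy A < N + (2^32)^n.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t; /* assumption: stand-in for mbedtls_mpi_uint */

/* Same contract as the reworked helper: d -= s over n limbs,
 * returning the borrow out (0 or 1). */
static limb_t sub_limbs( size_t n, limb_t *d, const limb_t *s )
{
    limb_t c = 0;

    for( size_t i = 0; i < n; i++ )
    {
        limb_t z = ( d[i] < c );
        d[i] -= c;
        c = ( d[i] < s[i] ) + z;
        d[i] -= s[i];
    }

    return( c );
}

/* Hypothetical wrapper: d holds A in n + 1 limbs, mod holds N in n limbs.
 * Sets d to A + (2^32)^n - N without any data-dependent loop.
 * Under the stated assumption, the returned top limb d[n] is
 * 1 if A >= N and 0 if A < N. */
static limb_t sub_and_flag( size_t n, limb_t *d, const limb_t *mod )
{
    d[n] += 1;                        /* d = A + (2^32)^n */
    d[n] -= sub_limbs( n, d, mod );   /* fold the borrow into the top limb */
    return( d[n] );
}
```

The borrow is consumed by a single subtraction on d[n], so no loop
whose iteration count depends on the borrow is needed.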

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
diff --git a/library/bignum.c b/library/bignum.c
index e59d91f..26784f9 100644
--- a/library/bignum.c
+++ b/library/bignum.c
@@ -1105,12 +1105,23 @@
 }
 
 /*
- * Helper for mbedtls_mpi subtraction:
- * d -= s where d and s have the same size and d >= s.
+ * Helper for mbedtls_mpi subtraction.
+ *
+ * Calculate d - s where d and s have the same size.
+ * This function operates modulo (2^biL)^n and returns the carry
+ * (1 if there was a wraparound, i.e. if `d < s`, and 0 otherwise).
+ *
+ * \param n             Number of limbs of \p d and \p s.
+ * \param[in,out] d     On input, the left operand.
+ *                      On output, the result of the subtraction.
+ * \param[in] s         The right operand.
+ *
+ * \return              1 if `d < s`.
+ *                      0 if `d >= s`.
  */
-static void mpi_sub_hlp( size_t n,
-                         mbedtls_mpi_uint *d,
-                         const mbedtls_mpi_uint *s )
+static mbedtls_mpi_uint mpi_sub_hlp( size_t n,
+                                     mbedtls_mpi_uint *d,
+                                     const mbedtls_mpi_uint *s )
 {
     size_t i;
     mbedtls_mpi_uint c, z;
@@ -1121,11 +1132,7 @@
         c = ( *d < *s ) + z; *d -= *s;
     }
 
-    while( c != 0 )
-    {
-        z = ( *d < c ); *d -= c;
-        c = z; i++; d++;
-    }
+    return( c );
 }
 
 /*
@@ -1136,6 +1143,7 @@
     mbedtls_mpi TB;
     int ret;
     size_t n;
+    mbedtls_mpi_uint c, z;
 
     if( mbedtls_mpi_cmp_abs( A, B ) < 0 )
         return( MBEDTLS_ERR_MPI_NEGATIVE_VALUE );
@@ -1162,7 +1170,12 @@
         if( B->p[n - 1] != 0 )
             break;
 
-    mpi_sub_hlp( n, X->p, B->p );
+    c = mpi_sub_hlp( n, X->p, B->p );
+    while( c != 0 )
+    {
+        z = ( X->p[n] < c ); X->p[n] -= c;
+        c = z; n++;
+    }
 
 cleanup:
 
@@ -1768,7 +1781,7 @@
      * timing attacks. */
     /* Set d to A + (2^biL)^n - N. */
     d[n] += 1;
-    mpi_sub_hlp( n, d, N->p );
+    d[n] -= mpi_sub_hlp( n, d, N->p );
     /* Now d - (2^biL)^n = A - N so d >= (2^biL)^n iff A >= N.
      * So we want to copy the result of the subtraction iff d->p[n] != 0.
      * Note that d->p[n] is either 0 or 1 since A - N <= N <= (2^biL)^n. */