The size of the modi¬ed circuit: Step 0 of the protocol replaces the circuit

with a di¬erent one which has max(4n, 8s2 ) input wires. The statistical security

parameter s2 therefore a¬ects the size of the circuit, both in terms of the number

of wires and the number of gates. When n < 2s2 , as in our experiments, we have

8s2 new input wires. Each original input wire is replaced with the exclusive-or of

about 4s2 input wires, which can be computed using 4s2 ’ 1 gates. The circuit

therefore grows by about 4ns2 gates, which in our case translate to 2560 gates for

s2 = 40, and 3840 gates for s2 = 60. We managed to optimize this construction

Implementing Two-Party Computation 15

by using a variant of structured Gaussian elimination in order to reuse gates.

As a result, for the case of s2 = 40, the augmented circuit produced in Stage 0

has over one thousand gates and over one thousand ¬ve hundred internal wires.

If s2 is increased to 60 then the augmented circuit now has over one thousand

¬ve hundred gates and over two thousand internal wires. The exact increase in

size depends on the random choices made in Stage 0, but the above values are

indicative.

Implementation: The program was implemented in C++ using standard li-

braries; the elliptic curve routines made use of specially written assembly func-

tions to perform the arithmetic instructions. On the machine that was used for

the experiments, and the curve we were using, the software needed 3.9 millisec-

onds for a basic multiplication, 1.2 milliseconds to multiply the ¬xed generator,

and 5.1 milliseconds in order to compute (aP + bQ) (using a variant of the

method described in Algorithm 14.88 of [16]).

The input to the program was a circuit represented by a text ¬le, each line of

the text ¬le represented a gate. For example the line

2 1 0 16 32 0100

represents a 2-to-1 gate which has input wires numbered 0 and 16 and produces

the output wire 32. The value of the gate is given by the string which follows.

The above example implements a two-bit “less than” gate, namely it will output

a 1 on wire 32 only if w0 < w16 , i.e. the value of wire 0 is zero and the value of

wire 16 is one.

Experiments: We performed a set of experiments with di¬erent values of the

statistical security parameters s1 and s2 , and using both the ROM and standard

model versions of the protocol. The run times, in seconds, are presented in Table

1, and are reported for each step of the protocol. Timings are performed using

the standard Linux system timing facilities, and are as such only indicative.

The wall time is measured using the standard time function and the system and

user times are measured using the getrusage function. The wall time represents

the elapsed wall clock time in running the program, the user time represents

the amount of time each party actually performed some computation, whereas

the syst time represents the time spent by each party in system routines (for

example transmitting data, or writing to disk, etc.). All timings were performed

on an Intel Core 2 6420 running at 2.13 GHZ with 4096 KB of cache and 2 GB

of RAM and are given in seconds.

Basic observations: The computation is not instantaneous but overall the run

time is quite reasonable (the overall run time is about 2-3 minutes for a security

parameter s1 = 160). The run time is a¬ected, of course, by the fact that 160

copies of the circuit are being used in the computation (compared to a protocol

secure against semi-honest adversaries, which uses only a single copy of the

circuit), and the fact that each circuit is much larger than its original form (in

the experiment more than 1000 gates are added to the circuit in Step 0, where

the original circuit consisted of less than 20 gates).

16 Y. Lindell, B. Pinkas, and N.P. Smart

Oblivious transfers: It is a little surprising that Step 2, which includes the

oblivious transfers, is not the main bottleneck of the protocol. This is true even

though we implemented an OT protocol which is secure against malicious ad-

versaries according to a full simulation de¬nition.

Preprocessing: About half of the run time is consumed by Step 1, where P1

prepares the circuits and the commitments. This step can be run o¬„ine, before

the inputs are known, reducing the online run time by about 50%.

Scaling: Increasing s1 by a factor of c1 increases by a factor of c2 the number

1

of commitments generated by P1 in Step 1, and increases the number of circuits

by c1 . Increasing s2 by a factor of c2 increases the size of the modi¬ed part of

the circuit (which is the bulk of the circuit in our experiments) by a factor of

c2 , and therefore the total size of the circuits is increased by a factor of c1 c2 .

In the experiments, we increased both s1 and s2 by a factor of 1.5 (from 40 to

60, and from 160 to 240, respectively). We therefore expected the overhead to

increase by a factor of 2.25. The actual measurements showed an increase by a

factor slightly larger than 2.

We did not conduct experiments with circuits of di¬erent sizes. When all

other parameters are ¬xed, we expect the run time to be linear in the size of

the modi¬ed circuit (after the modi¬cations done in Step 0). We can estimate

the size of the modi¬ed circuit as follows: If P2 has n input wires in the original

circuit, then the modi¬ed circuit is expected to have about n max(4n, 8s2 ) more

2

gates. (Applying structured Gaussian elimination can enable us to reuse gates

and minimize the size of the modi¬ed circuit.)

Performance in the ROM and in the standard model: What is interest-

ing about the timings is that there is very little di¬erence between the timings

in the ROM and those in the standard model. In Step 1 the ROM version

is more e¬cient simply due to the slightly more e¬cient encryption scheme

used.7 Given the large number of encryptions needed to produce the garbled

circuit this translates into a small advantage for the ROM version compared

to the standard-model implementation. For a similar reason one obtains a per-

formance improvement in the ROM in Step 7 in which the circuit is evaluated

by P2 . The decrease in performance of the ROM compared to the standard

model in Step 3 we cannot explain, but it is likely to be caused by experimental

error.

In viewing the timings it should be born in mind that the main place that

the random oracle model is used is in the oblivious transfers in Step 2. At this

point we use the ROM to reduce the round complexity of the two required

zero-knowledge proofs (see Appendix A for details of this). However, these two

7

The KDF is invoked in the standard model protocol about twice as many times

as in the ROM protocol (since the encryption function in the standard model calls

the KDF twice). The increase in the run time of Step 1 when changing the ROM

implementation to the standard-model implementation (for s1 = 160) is from 60sec

to 67sec. We therefore estimate that the circuit construction (Step 1(a)) takes about

7 seconds in the ROM protocol and 14 seconds in the standard model protocol.

Implementing Two-Party Computation 17

proofs are only used once in the whole run of the protocol as we have batched

the oblivious transfers, and therefore the run time of Step 2 is about the same

in both the ROM and the standard model protocols.

What is surprising about the fact that the standard model is comparable in

performance to the ROM is that for simpler cryptographic functionalities, such

as encryption or signature schemes, the performance of the best ROM based

scheme is often twice as fast as the best known standard model scheme.

6 Future Work

An obvious task is to develop the current implementation into a complete system

for secure computation. In particular, the system should include a front end that

will enable users to provide a high-level speci¬cation of the function that they

want to compute, and specify the di¬erent security parameters that shall be used.

A natural approach for this task would be to modify the FairPlay compiler [15]

to support our implementation.

Table 1. Run times of our experiments

Run Times in the Random Oracle Run Times in the standard Model

Model

Step

Step

Time 1 2 3 4 5 6 7 8 Total

Time 1 2 3 4 5 6 7 8 Total

P1 , s1 = 160, s2 = 40

P1 , s1 = 160, s2 = 40

Wall 84 20 24 0 7 7 00 142

Wall 74 20 24 0 7 10 00 135

User 67 18 10 0 5 3 00

User 60 17 12 0 3 4 00

Syst 15 0 5 0 0 0 00

Syst 16 2 3 0 0 0 00

P2 , s1 = 160, s2 = 40

P2 , s1 = 160, s2 = 40

Wall 84 20 24 0 7 7 40 2 184

Wall 74 20 24 0 8 9 35 1 171

User 0 10 13 0 7 5 32 4

User 0 8 14 0 8 7 29 1

Syst 0 0 11 0 1 3 82

Syst 0 0 10 0 2 4 80

P1 , s1 = 240, s2 = 60

P1 , s1 = 240, s2 = 60