Some more:
A bare loop with no body
=set_yield(-1,-1) t0=get_tick_count() for i=0,100000 do end return get_tick_count() - t0
120
A loop calling an empty Lua function.
=local f=function() end set_yield(-1,-1) t0=get_tick_count() for i=0,100000 do f() end return get_tick_count() - t0
670
Interestingly, an empty C function takes only 480.
Math performance in C.
We have often avoided FP math because it's entirely in software, but I've never really had a feeling for how bad it may be. Integer division is also in software.
For this test, I made lua functions that add, multiply or divide two C globals and store the result in another global.
This is not really a good test, since it's mostly Lua<->C overhead, and load/store overhead is likely to be significant for integers too. To measure this, I also included an empty function and one that just does an assignment.
As one would expect, the results (especially for division and multiplication) depend on the actual values: multiplying or dividing by 1 is fast.
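For readers without the patch, here is a rough sketch of what test functions of this kind could look like. This is not the attached patch: the global names, the volatile qualifiers and the plain void signatures are my assumptions for illustration. The real functions would be registered with Lua and so would take a lua_State argument. Only the test_* names follow the ones used in the test script.

```c
/* Hypothetical sketch, NOT the attached patch: variable names and
   signatures are assumptions made for illustration only. */

/* Operands and results live in globals. volatile keeps the compiler
   from folding the arithmetic away or skipping the load/store. */
volatile int   test_i1 = 1, test_i2 = 1, test_ir;
volatile float test_f1 = 1.0f, test_f2 = 1.0f, test_fr;

/* Setters, named after the test_seti / test_setf mentioned in the post. */
void test_seti(int a, int b)     { test_i1 = a; test_i2 = b; }
void test_setf(float a, float b) { test_f1 = a; test_f2 = b; }

/* Baseline: measures call overhead only. */
void test_nop(void)  { }

/* Integer variants: one load/store, or two loads, one op, one store. */
void test_imov(void) { test_ir = test_i1; }
void test_iadd(void) { test_ir = test_i1 + test_i2; }
void test_imul(void) { test_ir = test_i1 * test_i2; }
void test_idiv(void) { test_ir = test_i1 / test_i2; }

/* Floating point variants with the same shape. */
void test_fmov(void) { test_fr = test_f1; }
void test_fadd(void) { test_fr = test_f1 + test_f2; }
void test_fmul(void) { test_fr = test_f1 * test_f2; }
void test_fdiv(void) { test_fr = test_f1 / test_f2; }
```

Keeping every function the same shape means that any difference from nop should come from the operation itself plus the load and store, which is what the mov variants are there to isolate.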
Test code (after setting values to be operated on with test_seti and test_setf), using patch attached.
r={}
funcs={'nop','fdiv','fmul','fadd','fmov','idiv','imul','iadd','imov'}
set_yield(-1,-1)
for _,n in ipairs(funcs) do
local f=_G['test_'..n]
t0=get_tick_count()
for i=0,100000 do
f()
end
r[n] = get_tick_count()-t0
sleep(50)
end
return r
The values used in each operation are given below as i1, i2, f1 and f2.
Results
i1 = i2 = f1 = f2 = 1
nop=480
fmov=480
fmul=520
fdiv=510
fadd=520
imov=480
iadd=480
imul=480
idiv=500
From this we can see that overhead is ~480, and load / store is insignificant. nop and mov are omitted from the following.
i1=f1=60000
i2=f2=123
fadd=540
fmul=540
fdiv=880
iadd=480
imul=480
idiv=560
i1=f1=60000
i2=f2=12345
fadd=550
fmul=540
fdiv=880
iadd=480
imul=480
idiv=520
i1=f1=1700000
i2=f2=3
fadd=540
fdiv=880
fmul=540
iadd=480
imul=480
idiv=610
Subtracting the overhead of 480, the fdiv above takes 400 ticks for ~100000 calls, which works out to about 4 microseconds per division, or roughly 250k/sec. Integer division is faster.
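The arithmetic behind those figures, as a small sketch. It assumes one tick is one millisecond, which is what the 4 microsecond figure implies. The function names are mine. Note that for i=0,100000 runs 100001 times, which makes no practical difference here.

```c
/* Hypothetical helpers for turning the tick counts above into per-call costs.
   Assumes get_tick_count() counts milliseconds. */

/* Average cost of one operation in microseconds, after subtracting the
   time measured for the same loop with no operation in it. */
double per_op_us(long total_ms, long overhead_ms, long iterations)
{
    return (double)(total_ms - overhead_ms) * 1000.0 / (double)iterations;
}

/* Operations per second at a given per-operation cost in microseconds. */
double ops_per_sec(double us_per_op)
{
    return 1000000.0 / us_per_op;
}
```

With the numbers from the post, per_op_us(880, 480, 100001) comes to just under 4.0, and the slowest integer division measured (610) comes to about 1.3.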
Bottom line: FP math or integer division can have significant impact if thousands of operations are involved, but it's not so slow it needs to be avoided at all costs.
Equivalent of the last case (1700000 and 3) done directly in Lua:
=local a=1700000 local b=3 local z set_yield(-1,-1) t0=get_tick_count() for i=0,100000 do z=a+b end return get_tick_count() - t0
240
=local a=1700000 local b=3 local z set_yield(-1,-1) t0=get_tick_count() for i=0,100000 do z=a*b end return get_tick_count() - t0
240
=local a=1700000 local b=3 local z set_yield(-1,-1) t0=get_tick_count() for i=0,100000 do z=a/b end return get_tick_count() - t0
360
Given that ~120 is loop overhead, this is not bad.