I have now marked in red all parts of the image where the RAW values are > 15000. As you can see, there is a large plateau there. Small EV changes mean that the area is sometimes overexposed and sometimes not.
Yeah, that seems likely to be the explanation.
Maybe ev_change_max = 32 was too big.
IIRC, I didn't pick that as the default based on any real analysis; it was a guess that seemed to work most of the time. The value does need to be at least big enough to keep up with the fastest sustained change, and obviously a smaller value will make it react more slowly to real changes.
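To make the trade-off concrete, here's a rough sketch in Python (not the script's actual Lua code). The function names are made up for illustration, and I'm assuming the per-shot limit is a simple symmetric clamp on the requested change, in the same units as ev_change_max.

```python
def clamp_ev_change(wanted, ev_change_max):
    """Limit a requested per-shot exposure change to +/- ev_change_max.

    Hypothetical helper; assumes ev_change_max > 0 and integer units.
    """
    return max(-ev_change_max, min(ev_change_max, wanted))


def shots_to_converge(total_change, ev_change_max):
    """Count how many shots it takes to apply total_change when every
    shot is clamped to +/- ev_change_max."""
    shots = 0
    remaining = total_change
    while remaining != 0:
        remaining -= clamp_ev_change(remaining, ev_change_max)
        shots += 1
    return shots


if __name__ == "__main__":
    # A real change of 96 units takes 3 shots with a limit of 32,
    # but 12 shots with a limit of 8.
    print(shots_to_converge(96, 32))
    print(shots_to_converge(96, 8))
```

A smaller limit damps the shot-to-shot jumps but lengthens the time needed to follow a genuine change in scene brightness by the same factor.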
That said, I think the script has always had a tendency to get into oscillations in cases like this, and probably needs different logic to handle it well. In fact, I see a comment:
-- TODO
-- to avoid flapping as limits approached over / under weights
Perhaps a way to deal with this would be to account for the fraction near overexposure somehow, or to apply varying weight depending on how deep into exp_over_margin a value is, effectively dividing the "overexposed" bucket in the histogram into multiple buckets with different weights. Or perhaps the "smoothing" should give more weight to the longer-term trend, but I think this problem is specific to how the histogram-based limits work.
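As a rough illustration of the varying-weight idea, here's a Python sketch (again, not the script's Lua, and not how exp_over_margin is currently defined). The function names and the margin_start / white_level values are invented for the example; only the 15000 threshold comes from the post above. It compares a single "overexposed" bucket with a weight that ramps linearly through the margin.

```python
def binary_over_fraction(pixels, over_thresh):
    """Fraction of pixels at or above over_thresh: one 'overexposed' bucket."""
    return sum(1 for p in pixels if p >= over_thresh) / len(pixels)


def graded_over_fraction(pixels, margin_start, white_level):
    """Weighted fraction of near-overexposed pixels.

    The weight is 0 at or below margin_start, 1 at or above white_level,
    and ramps linearly in between (hypothetical weighting scheme).
    """
    span = white_level - margin_start
    total = 0.0
    for p in pixels:
        weight = (p - margin_start) / span
        total += max(0.0, min(1.0, weight))
    return total / len(pixels)


if __name__ == "__main__":
    # A large plateau sitting right at the threshold, then nudged just below
    # it by a small EV change.
    plateau_at = [15000] * 1000
    plateau_below = [14990] * 1000

    # Single bucket: the whole plateau flips between "over" and "not over".
    print(binary_over_fraction(plateau_at, 15000),
          binary_over_fraction(plateau_below, 15000))

    # Graded weight: the same nudge changes the result only slightly.
    print(graded_over_fraction(plateau_at, 14000, 16000),
          graded_over_fraction(plateau_below, 14000, 16000))
```

With the single bucket, a tiny exposure change moves the measured fraction from all to nothing, which is the kind of step that could drive flapping. With the ramp, the measure changes in proportion to the exposure change.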
As always, adding more complexity will make it harder to test.
An idle thought: With something like the chdkptp remoteshoot glue, it would be possible to create a GUI that lets you visualize what the script is doing and change settings in real time.