I moved this to the PTP interface thread, since it was more of a protocol / camera side issue than a chdkptp client issue.
In r4798, I committed the following:
ptp - updates to PTP_CHDK_GetMemory, based on discussion in
https://chdk.setepontos.com/index.php?topic=6231.msg132332#msg132332
* Automatically work around Canon data->send_data failing on NULL source, by sending any bytes in first word buffered if source address is < 4
* Update to use data->send_data instead of send_ptp_data. Better performance, same strategy already used for live view, remote shoot
* Update protocol to 2.8: PTP_CHDK_GetMemory input param4 is option which defines whether transfer should be buffered or unbuffered, independent of address 0 workaround. Default (0) is unbuffered, compatible with previous versions
I went through a few different approaches before I settled on this.
On my cameras, the problem with NULL really seems to be only that the Canon function treats NULL as an error. So I made the default GetMemory function just detect that and transfer any bytes in the first word separately. (First word rather than first byte because of
https://chdk.setepontos.com/index.php?topic=13101.0)
This means that existing clients should now be able to do RAM dumps from address zero, although it's possible some will still fail due to the first 4 or 16 KB being TCM.
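The address 0 workaround described above can be sketched roughly as follows. This is a simulation under stated assumptions, not the actual CHDK code: `sim_mem`, `sim_out`, `sim_send_from_addr`, `sim_send_from_buf` and `get_memory` are hypothetical names, "addresses" are just offsets into an array, and the stand-in send functions do not model the real data->send_data signature. The only behavior taken from the post is that the direct send rejects a zero source address, and that bytes in the first word are sent from a local buffer when the source address is < 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simulated camera address space; "addresses" are offsets into this array. */
static uint8_t sim_mem[64];

/* Bytes the simulated host has received for the current transfer. */
static uint8_t sim_out[64];
static size_t sim_out_len;

/* Hypothetical stand-in for the direct (unbuffered) send.
   Like the Canon function described in the post, it treats address 0 as an error. */
static int sim_send_from_addr(uint32_t addr, size_t size) {
    if (addr == 0) return -1;
    memcpy(sim_out + sim_out_len, sim_mem + addr, size);
    sim_out_len += size;
    return 0;
}

/* Hypothetical stand-in for sending from a local buffer. */
static int sim_send_from_buf(const uint8_t *buf, size_t size) {
    memcpy(sim_out + sim_out_len, buf, size);
    sim_out_len += size;
    return 0;
}

/* Send size bytes starting at addr. Any bytes that fall in the first word
   (addr < 4) go through a small local buffer, the rest is sent directly. */
static int get_memory(uint32_t addr, size_t size) {
    assert((size_t)addr + size <= sizeof sim_mem);
    sim_out_len = 0;
    if (addr < 4 && size > 0) {
        uint8_t first[4];
        size_t n = 4 - addr;          /* bytes left in the first word */
        if (n > size) n = size;
        memcpy(first, sim_mem + addr, n);
        if (sim_send_from_buf(first, n) != 0) return -1;
        addr += (uint32_t)n;
        size -= n;
    }
    if (size > 0) return sim_send_from_addr(addr, size);
    return 0;
}
```

In this sketch, a request starting at 0 is split into a 4 byte buffered piece plus a direct send from address 4, while a request starting at 4 or above takes the direct path unchanged.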
I also changed the unbuffered GetMemory to use a single send_data call, unlike the original version which sent in buffer_size chunks despite not being buffered. This is noticeably faster (33 MB/s vs 28 MB/s on D10). I believe the original logic was based on the idea that the firmware might try to allocate internal buffers for the whole transfer, but this is clearly not true, because live view and remote capture send all at once.
Separate from this, I added the mode option to PTP_CHDK_GetMemory to do the entire transfer buffered. This requires client support: chdkptp r741 or later.
It currently uses memcpy. Originally, I was going to have another option for word-by-word copy, but in my testing memcpy worked fine for MMIO addresses. If it turns out word-by-word is needed, it's easy to add. Note that many MMIOs are write only, and reading them may have side effects, so dumping the whole MMIO space may not be a great idea in any case.
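To make the difference between the two modes concrete, here is a minimal sketch, again a simulation rather than the real implementation. `sim_send`, `get_memory_unbuffered` and `get_memory_buffered` are hypothetical names, and the caller-supplied staging buffer is an assumption for illustration. Unbuffered hands the source pointer to the send function in a single call. Buffered copies each chunk with memcpy into the staging buffer and sends that instead, so the send function never reads the source directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes the simulated host has received, and how many send calls it took. */
static unsigned char sim_out[256];
static size_t sim_out_len;
static unsigned sim_send_calls;

/* Hypothetical stand-in for the camera-side send function. */
static int sim_send(const void *src, size_t size) {
    if (sim_out_len + size > sizeof sim_out) return -1;
    memcpy(sim_out + sim_out_len, src, size);
    sim_out_len += size;
    sim_send_calls++;
    return 0;
}

/* Unbuffered: pass the source straight through in one call. */
static int get_memory_unbuffered(const unsigned char *src, size_t size) {
    return sim_send(src, size);
}

/* Buffered: stage each chunk in buf with memcpy, then send the staged copy. */
static int get_memory_buffered(const unsigned char *src, size_t size,
                               unsigned char *buf, size_t buf_size) {
    assert(buf_size > 0);
    while (size > 0) {
        size_t n = size < buf_size ? size : buf_size;
        memcpy(buf, src, n);
        if (sim_send(buf, n) != 0) return -1;
        src += n;
        size -= n;
    }
    return 0;
}
```

If word-by-word copy turned out to be needed, the memcpy in the buffered loop is the one place that would change in a sketch like this.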
Buffered is noticeably slower than unbuffered, ~17 vs 35 MB/s for G7X. Oddly, D10 suffers less of a penalty (18 vs 33 MB/s).
While on the subject of speed:
On A540, cached addresses are much slower than uncached, 5 vs 15 MB/s. ROM is in between at 12. I think these old cameras buffer internally if the address isn't in uncached RAM. This was noted long ago.
Interestingly, on newer cameras (D10 onward) there seems to be little difference between cached and uncached. ROM is somewhat slower than RAM (26 MB/s on G7X, 13 on D10).