Cached/Uncached memory Vs Read/Write: I tried something interesting ! - General Discussion and Assistance - CHDK Forum


Offline Lebeau

I made a simple deduction and tested my conclusion.

Since we can make the raw data memory cached and still write that data to disk using:
-   write(fd, (char*)(((unsigned long)rawadr)|CAM_UNCACHED_BIT), hook_raw_size());

I supposed that reading and writing any data allocated in cached memory would work correctly when
-   ORing its address with CAM_UNCACHED_BIT

I tried it and it works on my A650.

Therefore, all memory manipulation can use the faster cached memory, except for reading from and writing to files.

Results from my A650 (DNG format, bad pixel patching, ...):
 - Uncached raw data : about 7 secs
 - Cached raw data: about 5 secs
 - Everything cached but uncached read/write: about 4 secs.

That's 20% faster for me, using the following.
In camera.h:
 - #define UNCACHED_ADDR(buf)   ( ( char * ) ( ( ( unsigned long ) (buf) ) | CAM_UNCACHED_BIT ) )

 - read/write ( fa, UNCACHED_ADDR(buf), buf_size );
 - fread/fwrite ( UNCACHED_ADDR(buf), 1, BUF_SIZE, fa );
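
To make the address arithmetic concrete, here is a minimal host-side sketch of what the macro computes. The bit value 0x10000000 is only an assumption for illustration (the real CAM_UNCACHED_BIT is defined per camera), and `uncached_alias` / `cached_alias` are hypothetical helpers, not CHDK functions. The sketch only computes the alias and never dereferences it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the real one is set per camera. */
#define DEMO_UNCACHED_BIT 0x10000000u

/* Hypothetical helper: same arithmetic as UNCACHED_ADDR, on plain integers. */
uint32_t uncached_alias(uint32_t cached_addr)
{
    /* The sketch assumes cached addresses lie below the alias bit. */
    assert((cached_addr & DEMO_UNCACHED_BIT) == 0);
    return cached_addr | DEMO_UNCACHED_BIT;
}

/* Inverse mapping: clear the bit to get the cached view back. */
uint32_t cached_alias(uint32_t any_addr)
{
    return any_addr & ~DEMO_UNCACHED_BIT;
}
```

Both addresses name the same RAM; only the path through the data cache differs, which is where the question of coherency comes from.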


Is it portable?
« Last Edit: 19 / June / 2011, 21:43:31 by Lebeau »

*

Offline reyalp

No, this is unsafe unless you take steps to ensure coherency. Just because it seems to work in your particular case does not mean it is correct or safe.

Example
Code: [Select]
some_var = 1;
write(fd,&some_var | CAM_UNCACHED_BIT...
There is no guarantee the assignment will have been flushed to RAM by the time the write happens. The same thing can happen if you read to the uncached address and then try to do something with the value using the cached address.
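
The hazard can be reproduced in a toy model that runs on a PC. This is only a sketch under heavy simplification: one write-back cache line holding one word, with all names (`toy_mem`, `toy_cached_write`, and so on) made up for the illustration. It is not how the camera's cache is organized, but it shows the same ordering problem.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: a tiny RAM plus a single write-back cache line of one word. */
typedef struct {
    uint32_t ram[4];
    uint32_t line_data;
    int      line_index;   /* which RAM word the line holds, -1 if none */
    int      line_dirty;   /* 1 if line_data has not reached RAM yet */
} toy_mem;

void toy_init(toy_mem *m)
{
    for (int i = 0; i < 4; i++)
        m->ram[i] = 0;
    m->line_data = 0;
    m->line_index = -1;
    m->line_dirty = 0;
}

/* Write the dirty line back to RAM ("clean" in ARM terms). */
void toy_clean(toy_mem *m)
{
    if (m->line_index >= 0 && m->line_dirty) {
        m->ram[m->line_index] = m->line_data;
        m->line_dirty = 0;
    }
}

/* Store through the cached address: lands in the cache line only. */
void toy_cached_write(toy_mem *m, int i, uint32_t v)
{
    assert(i >= 0 && i < 4);
    if (m->line_index != i) {
        toy_clean(m);          /* evict the previous occupant first */
        m->line_index = i;
    }
    m->line_data = v;
    m->line_dirty = 1;
}

/* Load through the uncached alias: bypasses the cache, sees RAM only. */
uint32_t toy_uncached_read(const toy_mem *m, int i)
{
    assert(i >= 0 && i < 4);
    return m->ram[i];
}
```

In this model `some_var = 1` corresponds to `toy_cached_write(&m, 0, 1)`, and the access through the uncached alias to `toy_uncached_read(&m, 0)`, which keeps returning the stale 0 until `toy_clean(&m)` runs. A real cache writes lines back whenever it happens to evict them, which is why the failure shows up only some of the time.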

This is not some abstract theoretical point: a number of bugs have been caused by exactly this. Edit: and these bugs are a royal pain to track down because the behavior is quite unpredictable...

You might ask why doing this in the DNG code is OK. It isn't really, but there are mitigating circumstances:
- When reverse bytes goes to read the raw data, those addresses are probably not in cache, since all the Canon code works on the uncached address.
- The raw buffer has most likely all been flushed out by the time it is written to SD. We reverse all the bytes, then we write the entire buffer from the start (which was reversed a long time ago).
- The entire data cache is only 4k. On a 10 MP camera, that's less than one row, which will probably be border pixels anyway. Even if the first or last 4k contained garbage, it would make little difference.
Quote
Is it portable?
It's equally wrong everywhere, so yeah, I guess it's portable ;)
Don't forget what the H stands for.

*

Offline Lebeau

Well, after some tries, it doesn't work as expected.

Setting the uncached bit doesn't work. I looked at cleaning and flushing the cache (in asm) before the write, but chose instead to use a cached malloc buffer during processing, copy that memory into a freshly allocated uncached (umalloc) buffer, then write that buffer before ufree-ing it.

Faster, really faster :)
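
The copy-then-write approach described above can be sketched as follows. This is a host-side illustration, not the code actually used: `staging_alloc` and `staging_free` are hypothetical stand-ins that map to plain malloc/free here (on the camera they would be the umalloc/ufree pair), and `fwrite` on a `FILE *` stands in for the low-level write.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins: on the camera these would be the uncached
   allocator pair (umalloc/ufree); here they are plain malloc/free. */
void *staging_alloc(size_t n) { return malloc(n); }
void  staging_free(void *p)   { free(p); }

/* Copy n bytes from a cached work buffer into a staging buffer,
   write the staging buffer, then release it. Returns 0 on success. */
int write_via_staging(FILE *out, const void *work, size_t n)
{
    assert(out != NULL && work != NULL);
    if (n == 0)
        return 0;
    void *stage = staging_alloc(n);
    if (stage == NULL)
        return -1;
    /* On the camera the CPU reads the work buffer through the cache,
       so the copy sees the CPU's own latest stores. */
    memcpy(stage, work, n);
    size_t done = fwrite(stage, 1, n, out);
    staging_free(stage);
    return done == n ? 0 : -1;
}
```

The cost is one extra copy and a temporary buffer of the same size as the data being written.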

*

Offline reyalp

Quote
Well, after some tries, it doesn't work as expected.

Setting the uncached bit doesn't work. I looked at cleaning and flushing the cache (in asm) before the write, but chose instead to use a cached malloc buffer during processing, copy that memory into a freshly allocated uncached (umalloc) buffer, then write that buffer before ufree-ing it.

Faster, really faster :)
If you use fopen etc., the Canon OS takes care of doing this for you, at the expense of a 32k uncached buffer per FILE *.
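
As a rough host-side analogy only (standard C stdio, not the Canon firmware, whose internals are not shown here), the sketch below gives a stream its own 32 KiB buffer with `setvbuf` and writes through it. `save_buffered` and `DEMO_STREAM_BUF` are names made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define DEMO_STREAM_BUF (32 * 1024)   /* mirrors the 32k figure above */

/* Write n bytes to path through a stream with its own 32 KiB buffer.
   Returns 0 on success, -1 on any failure. */
int save_buffered(const char *path, const void *data, size_t n)
{
    assert(path != NULL && data != NULL);
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    char *buf = malloc(DEMO_STREAM_BUF);
    /* setvbuf must be called before any other operation on the stream. */
    if (buf == NULL || setvbuf(f, buf, _IOFBF, DEMO_STREAM_BUF) != 0) {
        fclose(f);
        free(buf);
        return -1;
    }
    int ok = fwrite(data, 1, n, f) == n;
    if (fclose(f) != 0)       /* fclose flushes the stream buffer */
        ok = 0;
    free(buf);                /* safe only after the stream is closed */
    return ok ? 0 : -1;
}
```

The caller never touches the intermediate buffer; the trade-off is the memory it occupies for as long as the file stays open.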


*

Offline Lebeau

I am happy to hear that. I am not used to the differences between the "file" and "ffile" commands.

I will revisit the code but, for the dng header, I use "write" so ...

In "ARM Tech Ref Man - DDI0201D_arm946es_r1p1_trm.pdf", at page 3-11, there is this code to clean and flush the data cache.
Code: [Select]
It is usual to clean the cache before flushing it, so that external memory is updated with
any dirty data. The following code segment shows how you can clean and flush the
entire cache (assuming a 4KB data cache):

MOV r1, #0 ; Initialize segment counter
outer_loop
MOV r0, #0 ; Initialize line counter
inner_loop
ORR r2, r1, r0 ; Generate segment and line address
MCR p15, 0, r2, c7, c14, 2 ; Clean and flush the line
ADD r0, r0, #0x20 ; Increment to next line
CMP r0, #0x400 ; Complete all entries in one segment?
BNE inner_loop ; If not branch back to inner_loop
ADD r1, r1, #0x40000000 ; Increment segment counter
CMP r1, #0x0 ; Complete all segments
BNE outer_loop ; If not branch back to outer_loop
; End of routine

Could that code be used in ASM(...) to clean and flush data cache before writing cached dng header buffer and raw buffer ?
I assume it's faster than copying 66KB from cached to uncached buffer !


*

Offline reyalp

So no answer ?
Try it ?

You may also need to drain the write buffer.
Quote
I assume it's faster than copying 66KB from cached to uncached buffer !
The first step is to identify which parts of the process take significant time. No point saving a tenth of a millisecond when you spend 5 seconds writing to the SD card no matter what.

I don't think you will be able to measure how long it takes to copy 66kb without a lot of iterations. But that's just a guess.
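
A minimal sketch of measuring over many iterations, using only standard C on a PC. The numbers it produces say nothing about the camera; on the camera a tick counter would take the place of `clock()`. `avg_copy_seconds` is a name made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Average seconds per memcpy of n bytes over iters repetitions.
   Returns a negative value if the buffers cannot be allocated. */
double avg_copy_seconds(size_t n, unsigned iters)
{
    assert(n > 0 && iters > 0);
    unsigned char *src = malloc(n);
    unsigned char *dst = malloc(n);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, n);
    volatile unsigned char sink = 0;
    clock_t t0 = clock();
    for (unsigned i = 0; i < iters; i++) {
        src[0] = (unsigned char)i;   /* vary the input each pass */
        memcpy(dst, src, n);
        sink ^= dst[n - 1];          /* keep the copy observable */
    }
    clock_t t1 = clock();
    (void)sink;
    free(src);
    free(dst);
    return (double)(t1 - t0) / CLOCKS_PER_SEC / iters;
}
```

For example, `avg_copy_seconds(66 * 1024, 10000)` averages ten thousand copies of a 66KB buffer, so that the total run is long enough for the timer resolution not to dominate.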

*

Offline Lebeau

p2-24 of the same document "Drain write buffer"
----------------------------
Drain write buffer
This operation stalls instruction execution until the write buffer is emptied. This is
useful in real-time applications where the processor must be sure that a write to a
peripheral has completed before program execution continues. An example is where a
peripheral in a bufferable region is the source of an interrupt. When the interrupt has
been serviced, the request must be removed before interrupts can be re-enabled. This is
ensured if a drain write buffer operation separates the store to the peripheral and the
enable interrupt functions.
The drain write buffer operation is invoked by a write to register 7 using the following
ARM instruction:

MCR p15, 0, Rd, c7, c10, 4; drain write buffer

This stalls the processor core until any outstanding accesses in the write buffer are
completed, that is, until all data is written to external memory.
----------------------------

Based on that info, I am not so sure that draining the write buffer is necessary, since we don't have bidirectional communication between the SD card and the CPU. Am I right?


*

Offline Lebeau

I just tried the previous code and it doesn't seem to work at all. :(

 
