Magic Lantern's cache hacks on PowerShots? - page 2 - General Discussion and Assistance - CHDK Forum

*

Offline srsa_4c

Re: Magic Lantern's cache hacks on PowerShots?
« Reply #10 on: 22 / November / 2020, 13:45:50 »
I have finally taken some time to understand what the firmware's cache-related functions do (on ARM946E-S-based DIGICs). I added the identified functions - with (hopefully) descriptive names - to the DryOS sigfinder.
Cameras up to approximately DryOS r52 have functions that operate on both the instruction cache and the data cache (the cache type is their first argument).
Code:
cache_flush_and_enable
cache_clean_flush_and_disable
cache_flush_range
cache_clean_flush_range
cache_clean_range

Newer models have separate functions for icache and dcache.
Code:
icache_flush_and_enable
icache_disable_and_flush
dcache_flush_and_enable
dcache_clean_flush_and_disable
dcache_flush_range
dcache_clean_range
dcache_clean_flush_range
icache_flush_range

The only thing that hinders instruction cache hacks is CHDK itself - it flushes the entire icache whenever it loads a module. So, I made the following change:
Code:
Index: include/cache.h
===================================================================
--- include/cache.h (revision 5635)
+++ include/cache.h (working copy)
@@ -7,7 +7,7 @@
 /*
 arm cache control
 */
-void icache_flush_all(void);
+void icache_flush_range(void *addr, unsigned int size);
 void dcache_clean_all(void);
 
 #endif
Index: lib/armutil/cache.c
===================================================================
--- lib/armutil/cache.c (revision 5635)
+++ lib/armutil/cache.c (working copy)
@@ -23,6 +23,11 @@
     );
 }
 
+/* flushing range only required on ARMv5, flushing all icache instead */
+void icache_flush_range(void *addr, unsigned int size) {
+    icache_flush_all();
+}
+
 /* Values for Ctype fields in CLIDR */
 #define ARMV7_CLIDR_CTYPE_NO_CACHE      0
 #define ARMV7_CLIDR_CTYPE_INSTRUCTION_ONLY  1
@@ -222,6 +227,15 @@
     return 0x200 << sz;
 }
 
+unsigned int is_dcache_locked() {
+    unsigned int statusd = 0;
+    asm volatile (
+       /* get lockdown status */\
+       "MRC p15, 0, %0, c9, c0, 0\n"
+       : "=r"(statusd) : );
+    return statusd & 0xff;
+}
+
 /*
 flush (mark as invalid) entire instruction cache
 */
@@ -234,6 +248,22 @@
 }
 
 /*
+flush address range from instruction cache - cache hack friendly
+*/
+void __attribute__((naked,noinline)) icache_flush_range(void *addr, unsigned int size) {
+    asm volatile (
+    "add    r1, r1, r0\n"
+    "bic    r0, r0, #0x1F\n"
+"ifr1:\n"
+    "mcr    p15, 0, r0, c7, c5, 1\n"
+    "add    r0, r0, #0x20\n"
+    "cmp    r0, r1\n"
+    "bcc    ifr1\n"
+    "bx     lr\n"
+    );
+}
+
+/*
 clean (write all dirty) entire data cache
 also drains write buffer (like canon code)
 does *not* flush
@@ -240,7 +270,11 @@
 */
 void __attribute__((naked,noinline)) dcache_clean_all(void) {
   asm volatile (
-    "PUSH   {LR}\n"
+    "PUSH   {R4,LR}\n"
+    "bl     is_dcache_locked\n" // check for "cache hacks" being used
+    "mov    r4, #0\n"
+    "cmp    r0, #0\n"
+    "movne  r4, #0x40000000\n" // avoid cleaning locked 1st segment
     "BL     cache_get_config\n"
     "BL     dcache_get_size\n"
     "CMP    r0, #0\n"
@@ -248,7 +282,7 @@
     // index limit (max index+1)
     // per ARM DDI 0201D 4kb = bits 9:5
     "LSR    r3, r0, #2\n"
-    "MOV    r1, #0\n"
+    "MOV    r1, r4\n"
 "2:\n"
     "MOV    r0, #0\n"
 "1:\n"
@@ -262,7 +296,7 @@
     "BNE    2b\n"
     "MCR    p15, 0, r1, c7, c10, 4\n" // drain write buffer
 "3:\n"
-    "POP    {LR}\n"
+    "POP    {R4,LR}\n"
     "BX     LR\n"
   );
 }
Index: modules/module_load.c
===================================================================
--- modules/module_load.c (revision 5635)
+++ modules/module_load.c (working copy)
@@ -573,7 +573,7 @@
     // clean data cache to ensure code is in main memory
     dcache_clean_all();
     // then flush instruction cache to ensure no addresses containing new code are cached
-    icache_flush_all();
+    icache_flush_range(flat_buf, flat.reloc_start);
 
     // Return module memory address
     return flat_buf;
With this mod, icache hacks seem stable (no periodic refresh required).
For dcache hacks, one or more of the firmware routines need a modification similar to the one above. I have not yet tried dcache hacks with this mod.
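
To make the cost of the range flush easier to see, here is a host-side sketch that mirrors the asm loop of icache_flush_range from the patch above and counts how many per-line invalidate operations it would issue. The helper name icache_flush_range_ops is hypothetical (it is not part of CHDK), and the model only reproduces the address arithmetic, not the actual CP15 access.

```c
#include <stdint.h>

#define CACHE_LINE_SIZE 0x20u  /* 32-byte step, as in the asm loop */

/*
 * Hypothetical helper, for illustration only: counts the invalidate
 * operations the range loop would issue for a given address and size.
 * On the camera, each counted step is one
 * "mcr p15, 0, r0, c7, c5, 1" instruction.
 */
static unsigned int icache_flush_range_ops(uint32_t addr, uint32_t size)
{
    uint32_t end  = addr + size;                    /* add r1, r1, r0    */
    uint32_t line = addr & ~(CACHE_LINE_SIZE - 1u); /* bic r0, r0, #0x1F */
    unsigned int ops = 0;

    do {
        ops++;                                      /* invalidate one line */
        line += CACHE_LINE_SIZE;                    /* add r0, r0, #0x20   */
    } while (line < end);                           /* cmp r0, r1; bcc     */

    return ops;
}
```

In this model, an aligned 0x40-byte range takes 2 operations, an unaligned 0x20-byte range that straddles two lines also takes 2, and an aligned 100 KiB range (102400 bytes) takes 3200. Like the asm, the loop always executes at least once, even for a size of 0.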

*

Offline reyalp

Re: Magic Lantern's cache hacks on PowerShots?
« Reply #11 on: 22 / November / 2020, 15:37:04 »
Nice. FWIW, the only reason I didn't do range originally for modules was laziness.
Don't forget what the H stands for.

*

Offline srsa_4c

Re: Magic Lantern's cache hacks on PowerShots?
« Reply #12 on: 23 / November / 2020, 14:10:39 »
Quote from: reyalp
FWIW, the only reason I didn't do range originally for modules was laziness.

Flushing a range has the downside of being slower (it loops over the range, handling 0x20 bytes per iteration). Do we have modules that are large and whose loading is time-critical?

I tried a combined (icache + dcache) hack on the ixus150, without periodically refreshing either hack, and it worked. The only firmware routine that needed patching was dcache_clean_flush_range, to avoid flushing the first cache segment.
The demo picture shows a replacement get_string_by_id firmware function in action, returning the CHDK string for all requests.
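
As a rough illustration of what "avoiding the first cache segment" means, the following host-side sketch models the segment/index loop of the patched dcache_clean_all from the earlier diff. It is only a model under stated assumptions: the function name dcache_clean_ops is hypothetical, the loop structure is inferred from the visible parts of the diff (some lines of the loop are not shown there), and it assumes four segments selected by bits 31:30 with 0x20-byte index steps.

```c
#include <stdint.h>

#define LINE_SIZE   0x20u        /* index step per cache line          */
#define SEGMENT_INC 0x40000000u  /* segment number lives in bits 31:30 */

/*
 * Hypothetical model, for illustration only: counts the clean-by-index
 * operations issued for a data cache of dcache_size bytes. When
 * first_segment_locked is nonzero, the walk starts at segment 1, as the
 * patch does with "movne r4, #0x40000000".
 */
static unsigned int dcache_clean_ops(uint32_t dcache_size, int first_segment_locked)
{
    uint32_t index_limit = dcache_size >> 2;  /* LSR r3, r0, #2 */
    uint32_t segment = first_segment_locked ? SEGMENT_INC : 0u;
    unsigned int ops = 0;

    do {
        for (uint32_t index = 0; index < index_limit; index += LINE_SIZE) {
            ops++;  /* on the camera: clean the line at (segment | index) */
        }
        segment += SEGMENT_INC;  /* wraps to 0 after the fourth segment */
    } while (segment != 0u);

    return ops;
}
```

For an 8 KiB data cache this model gives 256 operations when nothing is locked and 192 when the first segment is skipped, so the locked segment's lines are never touched by the clean loop.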

*

Offline reyalp

Re: Magic Lantern's cache hacks on PowerShots?
« Reply #13 on: 23 / November / 2020, 15:22:43 »
Quote from: srsa_4c
Flushing a range has the downside of being slower (it loops over the range, handling 0x20 bytes per iteration). Do we have modules that are large and whose loading is time-critical?

IMO, the time required to load from SD will completely swamp the time to flush the cache, so it's unlikely to be worth worrying about. The largest module is Lua, which is ~100k of code.

Flushing unrelated stuff also has a cost, since it will need to be reloaded from main RAM.

 
