Camera crash on startup - investigation. - page 10 - General Discussion and Assistance - CHDK Forum

Camera crash on startup - investigation.

*

Offline srsa_4c

Re: Camera crash on startup - investigation.
« Reply #90 on: 19 / March / 2014, 16:44:28 »
Quote
Can you try the exact same scenario without the open change (build 3376 or earlier)? I wouldn't be surprised if there was some other weirdness in config saving, possibly depending on whether you have exited alt mode before powering off.
r3366 has no problems. I forgot to mention, I'm always exiting ALT mode before shutting down the cam, and all my tries were done in play mode.

« Reply #91 on: 19 / March / 2014, 16:51:58 »
When I started testing the _open/_write changes I noticed that some config settings were not being saved.
It's not just you, I'm getting this too (a3400, current trunk). Power on -> set user menu to [off] -> exit CHDK menu -> power off -> power on -> user menu still there.
During my MF tests with 3383, I've done the power cycle thing and come back to default config parameters.
Ported :   A1200    SD940   G10    Powershot N    G16

*

Offline reyalp

« Reply #92 on: 19 / March / 2014, 23:21:49 »
Ok, sounds like it's time to roll back this change. It would be interesting to know if open or write fails in the case where the CFG doesn't get updated.

edit:
Quote
r3366 has no problems. I forgot to mention, I'm always exiting ALT mode before shutting down the cam, and all my tries were done in play mode.
If you can reproduce this reliably, it might be worth doing some investigating:
1) do any of the FS calls mentioned above fail?
2) does replacing _Write with _write make it go away? (From Phil's earlier post, this doesn't seem likely.)
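For question 1, one option is to have each file-system step report its own failure, so a missing CFG update can be traced to open, write or close. The sketch below is only an illustration on a desktop system: it uses the POSIX open/write/close calls as stand-ins for the camera's functions, and checked_save and the SAVE_* flags are hypothetical names, not CHDK code.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Bit flags recording which step of a save failed (hypothetical names). */
enum {
    SAVE_OK           = 0,
    SAVE_OPEN_FAILED  = 1 << 0,
    SAVE_WRITE_FAILED = 1 << 1,
    SAVE_CLOSE_FAILED = 1 << 2
};

/* Write len bytes to path and report every step that failed. */
int checked_save(const char *path, const void *buf, size_t len)
{
    int result = SAVE_OK;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return SAVE_OPEN_FAILED;       /* nothing else to try without a descriptor */

    ssize_t written = write(fd, buf, len);
    if (written < 0 || (size_t)written != len)
        result |= SAVE_WRITE_FAILED;   /* error or short write */

    if (close(fd) != 0)
        result |= SAVE_CLOSE_FAILED;

    return result;
}
```

A caller can log the returned flags. A result of SAVE_OK together with a stale file on the next boot would point away from the calls themselves and towards something later, such as data that never reached the card.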
« Last Edit: 19 / March / 2014, 23:39:36 by reyalp »
Don't forget what the H stands for.

*

Offline reyalp

« Reply #93 on: 20 / March / 2014, 00:51:03 »
Here's an alternate approach that appears to avoid the startup crash for me: Take the same semaphore used by filewrite etc.

You need to find the semaphore I've called filesem, but it's fairly easy. The function that normally takes it is called from the function we hook for filewrite open (and many other easily identifiable file related locations)

This function (sub_FF854354 on D10) does some other stuff too, but taking the semaphore directly seems to work OK in my limited test.
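The shape of the idea (take the shared file semaphore, call open, give the semaphore back) can be sketched as below. This is a single-threaded model under stated assumptions, not the actual patch: toy_sem, toy_take, toy_give, fake_firmware_open and guarded_open are all hypothetical stand-ins, and a real semaphore call would block or time out rather than return immediately.

```c
#include <stddef.h>

/* Hypothetical model of a binary semaphore: 1 = free, 0 = taken. */
typedef struct { int available; } toy_sem;

/* Returns 0 on success, nonzero if the semaphore is not free.
   A real RTOS call would wait here instead of returning at once. */
static int toy_take(toy_sem *s)
{
    if (s->available == 0)
        return 1;
    s->available = 0;
    return 0;
}

static void toy_give(toy_sem *s)
{
    s->available = 1;
}

/* Plays the role of the semaphore called filesem in the post. */
static toy_sem filesem = { 1 };

/* Stand-in for the firmware's low-level open; always "succeeds". */
static int fake_firmware_open(const char *name)
{
    (void)name;
    return 3;
}

/* Hold the file semaphore for the duration of the open call. */
int guarded_open(const char *name)
{
    if (name == NULL)
        return -1;
    if (toy_take(&filesem) != 0)
        return -1;                 /* could not get the semaphore: do not call open */
    int fd = fake_firmware_open(name);
    toy_give(&filesem);            /* always release, whether or not open succeeded */
    return fd;
}
```

The point of the design is that the open call can no longer overlap with other code that holds the same semaphore. The model does not show what the real firmware does while waiting.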


*

Offline philmoz

« Reply #94 on: 20 / March / 2014, 05:10:40 »
Quote
Here's an alternate approach that appears to avoid the startup crash for me: Take the same semaphore used by filewrite etc.

You need to find the semaphore I've called filesem, but it's fairly easy. The function that normally takes it is called from the function we hook for filewrite open (and many other easily identifiable file related locations)

This function (sub_FF854354 on D10) does some other stuff too, but taking the semaphore directly seems to work OK in my limited test.

Seems to work ok on the G12.

Should the code check for the return value from _TakeSemaphore?
In the G12 & G1X the filewrite code calls DebugAssert if the return value is 9.

Phil.

CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)
  g5x (1.00c, 1.01a, 1.01b)
  g7x2 (1.01a, 1.01b, 1.10b)

*

Offline reyalp

« Reply #95 on: 20 / March / 2014, 16:22:07 »
Quote
Seems to work ok on the G12.

Should the code check for the return value from _TakeSemaphore?
Probably. This was just a quick and dirty POC.

One possible way it might fail is if the CHDK code could be called before the semaphore was initialized.
Quote
In the G12 & G1X the filewrite code calls DebugAssert if the return value is 9.
I think this is a standard function that is used in a lot of places (I've called it TakeSemaphore_strict by analogy to CreateTaskStrictly etc). Not sure what 9 means.

edit: Actually, this is already in funcs_by*.csv as TakeSemaphoreStrictly
« Last Edit: 20 / March / 2014, 16:57:45 by reyalp »

*

Offline srsa_4c

« Reply #96 on: 20 / March / 2014, 16:58:57 »
Quote
If you can reproduce this reliably, it might be worth doing some investigating:
1) do any of the FS calls mentioned above fail?
No.
Quote
2) does replacing _Write with _write make it go away
They're the same on this cam according to the sigfinder.
I now have the suspicion that the problem is due to caching. I could not reproduce it when I started logging into syslog and used a script to dump the log. I suspect that it takes some file system activity to flush that write cache. Close() might include code that flushes the cache.
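A desktop analogy of this kind of effect, not the camera's file system: C stdio streams usually hold written data in a buffer, and fclose is what pushes it out. The sketch below records the on-disk size before and after fclose. The names size_on_disk and buffered_write_demo are hypothetical, and whether the first size is smaller depends on the C library's buffering.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file on disk right now, or -1 if it cannot be queried. */
static long size_on_disk(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/* Write len bytes through a buffered stream and record the on-disk size
   before and after fclose(). Returns 0 on success, -1 on any failure. */
int buffered_write_demo(const char *path, const char *data, size_t len,
                        long *size_before_close, long *size_after_close)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    int ok = fwrite(data, 1, len, f) == len;   /* data may still sit in the stream buffer */
    *size_before_close = size_on_disk(path);

    if (fclose(f) != 0)                        /* fclose flushes the buffer */
        ok = 0;
    *size_after_close = size_on_disk(path);

    return ok ? 0 : -1;
}
```

If a power-off happened between the write and the close in a setup like this, the file would keep its old contents even though every call had returned success, which matches the "no FS call fails, yet the setting is lost" pattern.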

edit:
Quote
Not sure what 9 means.
'1' is probably 'error', 8 might mean 'timeout'.
« Last Edit: 20 / March / 2014, 17:02:12 by srsa_4c »

*

Offline reyalp

« Reply #97 on: 20 / March / 2014, 17:22:35 »
Quote
They're the same on this cam according to the sigfinder.
Sorry, I meant the actual function _Write calls (the first call; the second one is fsionotify stuff), but this is probably not worth pursuing.
Quote
I now have the suspicion that the problem is due to caching. I could not reproduce it when I started logging into syslog and used a script to dump the log. I suspect that it takes some file system activity to flush that write cache. Close() might include code that flushes the cache.
That would make sense, and might be a good reason not to prefer _open etc.
Quote
edit:
Not sure what 9 means.
'1' is probably 'error', 8 might mean 'timeout'.
It looks like 2 in the low level semaphore code gets turned into 9.


*

Offline philmoz

« Reply #98 on: 20 / March / 2014, 19:47:04 »
Quote
Seems to work ok on the G12.

Should the code check for the return value from _TakeSemaphore?
Probably, this was just a quick and dirty POC

One possible way it might fail is if the CHDK code could be called before the semaphore was initialized.

I was more thinking what would happen if a task switch occurred in the middle of an _Open/_Close call (which is where the original problem comes from).

Does _TakeSemaphore wait for the semaphore to be available (which would solve our problem), or does it fail (which would just change behaviour of the problem)?
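The difference between the two behaviours can be shown with a small single-threaded model, assuming a take call that accepts a timeout. Nothing here is firmware code: timed_sem, timed_tick, timed_take and the TAKE_* codes are hypothetical, and simulated "ticks" stand in for the other task running.

```c
/* Hypothetical single-threaded model: no real tasks or firmware calls. */
typedef struct {
    int available;        /* 1 = free, 0 = held by the other "task" */
    int ticks_until_free; /* how long the other "task" keeps holding it */
} timed_sem;

enum { TAKE_OK = 0, TAKE_TIMEOUT = 1 };   /* illustrative codes only */

/* Let one tick of simulated time pass; the holder may release the semaphore. */
static void timed_tick(timed_sem *s)
{
    if (!s->available && s->ticks_until_free > 0 && --s->ticks_until_free == 0)
        s->available = 1;
}

/* Try to take the semaphore, waiting at most timeout_ticks.
   timeout_ticks == 0 means "fail at once if it is held". */
int timed_take(timed_sem *s, int timeout_ticks)
{
    for (int waited = 0; ; waited++) {
        if (s->available) {
            s->available = 0;
            return TAKE_OK;
        }
        if (waited >= timeout_ticks)
            return TAKE_TIMEOUT;
        timed_tick(s);
    }
}
```

In this model, a semaphore held for 3 ticks is obtained when the caller allows 5 ticks of waiting, while a timeout of 0 or 1 returns TAKE_TIMEOUT. That second case is the "just changes the behaviour of the problem" outcome, and the caller then has to decide what to do without the semaphore.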

Quote
Quote
In the G12 & G1X the filewrite code calls DebugAssert if the return value is 9.
I think this is a standard function that is used in a lot of places (I've called it TakeSemaphore_strict by analogy to CreateTaskStrictly etc). Not sure what 9 means.

edit: Actually, this is already in funcs_by*.csv as TakeSemaphoreStrictly

I can add TakeSemaphoreStrictly to the stubs easily enough - do you think it would be better to call this and risk a DebugAssert?

Phil.

*

Offline philmoz

« Reply #99 on: 20 / March / 2014, 19:50:31 »
Quote
I now have the suspicion that the problem is due to caching. I could not reproduce it when I started logging into syslog and used a script to dump the log. I suspect that it takes some file system activity to flush that write cache. Close() might include code that flushes the cache.
That would make sense, and might be a good reason not to prefer _open etc.

I'm not sure this is the case.

On my cameras if I changed back to _Open/_Close then the problem went away.

But when I then restored the _open/_close patch it still worked correctly - at this stage I have not been able to reproduce the problem on any of my cameras.

IMO the TakeSemaphore approach is probably the best long-term solution.

Phil.

 
