CHDKPTP - PC Remote Control Performance Analysis - page 23 - RAW Shooting and Processing - CHDK Forum

CHDKPTP - PC Remote Control Performance Analysis

  • 465 Replies
  • 95702 Views
*

Offline reyalp

  • ******
  • 13673
Re: CHDKPTP - PC Remote Control Performance Analysis
« Reply #220 on: 18 / September / 2012, 17:42:17 »
Quote
I think you may disregard the forge request.  I downloaded from the stable area version S90 100c 2162.  The date stamp looks fine from today and powers up OK // only please confirm that it is the equivalent version to 101a 2142 test-1??
The 100c autobuild should have everything in your 101a test build. However, my ability to debug romlogs will be limited if you aren't using a build I built, because I won't be able to match addresses in the stack trace to specific CHDK code. So I'll post a build when I get a chance.

Also, we are currently missing a good S90 100C firmware dump (see also this post: http://chdk.setepontos.com/index.php?topic=8195.msg86170#msg86170 ). If you can make a new one, that would be extremely helpful for CHDK development. You can use the Canon Basic dumper here http://chdk.wikia.com/wiki/Canon_Basic/Scripts/Dumper or wait until I can post a build and do it using lua and native calls.

Is there an industrial-strength CHDK+ slowly rising over the horizon?  I have no doubt and am definitely looking forward to migrating the S90 & CHDK+ to my instrumentation.
I have no idea what this "CHDK+" you keep talking about is. There is only CHDK, and it's a dirty hack. If you expect industrial reliability from CHDK, you are setting yourself up for disappointment.

I'm serious about this. If you want to use CHDK for your project, that's great but you should have realistic expectations. CHDK will crash or otherwise misbehave for unexplained reasons, and there's absolutely no guarantee anyone will be willing or able to debug it for you.

Failure is not an option... It comes standard in every build.

Please remember this post when your misguided optimism inevitably bites you in the [admin: avoid swearing please].

Case in point:
Your ReadFDir romlogs appear to be an out of memory error. I'm pretty certain this isn't connected to file save timing, or anything related to SD card timing characteristics.

The assert call in question is at FFA7D6D8
The condition for this assert is the return value of sub_FF838DCC != 0.
sub_FF838DCC is AllocateUncacheableMemory

I don't have time right now to dig back through the stack dump.

Also, the two romlogs are identical. Given that the time is equal to the second, and the millisecond timestamps in the camera log are equal, there is no possibility they are from two distinct crashes. If this isn't due to attaching the same file by mistake, then at least one of your crashes did not generate a romlog. The romlog sits around in camera flash until it's overwritten, so if you use the romlog menu multiple times, it will just give you the same one over again. The date is 9/17, so it's not something really old.
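
A quick way to rule out the "same file attached twice" case before reading any timestamps is to compare the two downloads byte for byte. A minimal Python sketch, run on the PC side; the file names passed on the command line are whatever you saved the romlogs as, nothing here is part of CHDK or chdkptp:

```python
import hashlib
import sys
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_romlog(path_a, path_b):
    """True if the two files are byte-for-byte identical."""
    return file_digest(path_a) == file_digest(path_b)


if __name__ == "__main__":
    # Usage (hypothetical file names): python cmp_romlog.py first.log second.log
    if len(sys.argv) == 3:
        verdict = "identical" if same_romlog(sys.argv[1], sys.argv[2]) else "different"
        print(verdict)
```

Identical digests mean the two downloads hold the same log, which is what you'd expect if the second crash never wrote a new one.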
Don't forget what the H stands for.

Re: CHDKPTP - PC Remote Control Performance Analysis
« Reply #221 on: 18 / September / 2012, 19:24:39 »
I am wondering why you'd ask that. 
Thought about the question for a bit over the last few days. Why are we so curious?

Besides plain human nature, and at the risk of speaking for microfunguy (who has been here much longer than I), it seems that many people post on this forum, ask for and receive help, and present grand plans. Once they get the help, be it a custom script, usage advice, or customized firmware, they take that help, disappear, and contribute nothing back to the group. Most times they don't even acknowledge whether the solution they were given actually worked.

After a while it's not hard to get jaded.   

I don't know you so this is perhaps unfair,  but recent history suggests that once you have what you want,  we will never hear from you again.  Despite the huge volume of posts so far.

I'm not trying to start a flame war here - I'm just answering your question as it intrigued me.  Please feel free to prove me wrong. 



Ported :   A1200    SD940   G10    Powershot N    G16

*

Offline SticK

  • *****
  • 779
Re: CHDKPTP - PC Remote Control Performance Analysis
« Reply #222 on: 18 / September / 2012, 19:47:54 »
Quote "Also, the two romlogs are identical. "

You're right.  In my haste I did not validate the internal datestamps in my editor.  However, I just checked file properties and the files were downloaded 7 minutes apart last night // that's definite, and the 2nd one was downloaded after the 2nd fail.  So the only conclusion I can draw is that the 2nd log was not recorded by the camera.  If it happens again I will be much more vigilant and careful to get a comprehensive snapshot.

Quote "So I'll post a build when I get a chance."

I'll wait.  In any case, for the next day at least it's electro-optics on the new camera that I'm examining.

Quote "we are currently missing a good S90 100C firmware ... or wait until I can post a build and do it using lua and native calls."

Yes, I'll be happy to get you a dump as long as you give me the code and the easy commands needed.

Quote "I have no idea what this "CHDK+" you keep talking about is"

CHDK+ is shorthand for CHDK+CHDKPTP // faster to write.

Quote "Your ReadFDir romlogs appear to be an out of memory error. I'm pretty certain this isn't connected to file save timing, or anything related to SD card timing characteristics."

We all know software errors can masquerade in a myriad of seemingly disconnected ways.  In my humble opinion from what I am experiencing here, this is an asynchronous resource contention issue.  There is a small group of fail modes that the dumps report and that occur during listdir.  My rationale is this: why does failure frequency decrease with increased post-shoot delay?  It's because you're pushing the collision window to where the probability of it happening is increasingly small.  From this I can visualize these two possible contributors:
    (a) the Canon 15 MB write is always done completely before shoot returns and there is always an asynchronous contention that is Canon-induced during listdir while listdir should not be disturbed (cannot be solved obviously)
         -- or --
    (b) the Canon 15 MB write is incomplete when shoot returns and contention is CHDK-induced by listdir while Canon should not be disturbed (is solved either by moving the shoot post-delay further out, or, end-of-write detection).

There may be other possibilities but of these two obvious ones, I favor (b).  It agrees with your description of the return nature of the shoot command with Canon files.  It agrees very handsomely with the histogram I pointed you to; more than that, I was able to deduce this possibility with very little prior knowledge of SD card reorganization behavior and no prior knowledge of write time distribution: my hunch led me to that first forum and the histogram confirms my original ignorant interpretation.  Do you have some opinion about the histogram?  In addition, if this really were an out-of-memory error, you most likely would not get the back-to-back failures that occurred twice with delay time of 1.4s+, especially immediately after a hand-clean of directories and cold restart.  Furthermore, if this were a real out-of-memory, they would occur pretty much every time.  Even setting delay very short (300 ms) they don't occur every time.

My 1.6s test was started last night and it's still going // 3900+ JPG & CanonRAW files as I write.  A reasonably conclusive result that supports (b) would be no failure for another 24 hours.  If you've understood the statistics, a single failure at 1.6s in one week would still support (b).  No failure at all over a month (stopped by me) would imply that a solid threshold has been reached for this SD card, and it is safe to use.
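
To put a rough number on what a failure-free run can show: assuming every shoot cycle fails independently with the same probability p (a simplification, and one that would not hold if failures depend on state that builds up over a run), the largest p still consistent with zero failures in n cycles can be computed as below. The function names and the cycle counts in the demo loop are illustrative only, not measurements.

```python
def max_failure_rate(n_trials, confidence=0.95):
    """Upper bound on the per-trial failure probability after
    n_trials independent trials with zero failures.

    Solves (1 - p) ** n_trials = 1 - confidence for p.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)


def rule_of_three(n_trials):
    """Common approximation to the 95% bound above: p <= 3 / n."""
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    return 3.0 / n_trials


if __name__ == "__main__":
    # Illustrative cycle counts, not data from the thread.
    for n in (500, 2000, 10000):
        print(n, max_failure_rate(n), rule_of_three(n))
```

The bound shrinks only as roughly 3/n, so a clean run narrows the failure rate but cannot by itself show a hard threshold has been reached.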

I have two S90s now.  Try coding write-detect into shoot, and I can let one of them run for a week, or whatever time we want.  I am guessing it must be much easier than writing (possibly monstrous) code for direct USB transfer, although once srsa is done, I'd be happy to test it and use it hopefully.

*

Offline SticK

  • *****
  • 779
Re: CHDKPTP - PC Remote Control Performance Analysis
« Reply #223 on: 18 / September / 2012, 22:06:50 »
@waterwingz

Quote "Can you give the citations  for the Journals where the results have been presented ?"

Yes there is one, but not here simply because it will blow my alias away in a public forum // I am wondering why you'd ask that.

I suppose you are referring to this.   I think your interest and microfunguy's interests are different.  I understood microfunguy as being interested in my research.  I understand your interest is more in not being jaded.  Perhaps a better word would have been for me to have chosen "surprised" rather than "wondering."  I am certain he understood my position despite a poor word choice.  I am quite sure he also knows that even in private I can disclose very little until a manuscript has been accepted.  In fact, anyone who comes to see my lab has to sign an NDA and I don't give out or send out any information, as is the case for most researchers in the middle of work.

To address your concerns more specifically, please refer to my last paragraph here:
http://chdk.setepontos.com/index.php?topic=8613.msg90948#msg90948

I add that I have not presented any grand plans to anyone, except the *hope* the S90+CHDK will work with my instrumentation.  Let us remember this .... there are four phases of acceptance:  Phase 1 - basic standalone camera operation acceptance; Phase 2 - basic S90+CHDK instrument interoperability and compatibility;  Phase 3 - hardware modification and introduction into the equipment;  Phase 4 - actual in-place functional acceptance.  I am only in Phase 1 at present, and it's looking significantly better now than just a few weeks ago.  But, this endeavor is *very* far, months, from the end of Phase 4.

I cannot prove your feelings wrong or right about something like this // they are what they are from your personal experiences.

But I think to help put you more at ease with me, this would be very important to understand.  Phase 2 will likely pass, and CHDK+CHDKPTP are needed for that.  However, Phases 3 & 4 are a major uphill battle for me only and do not depend on CHDK+CHDKPTP.  There are chances the system won't make it past Phase 3, and we have discussed this at least once before.  I don't program lua or reverse engineer ARM code, so I cannot help the CHDK community there.  But one place where I can help, is as I expressed in my last paragraph in the link.  I think reyalp must be on the same page as me otherwise he would not be involved.  I know his mission is the community first, to me personally last, and I am very aware of that at every step.  Expressed again: I don't ask for anything that I feel cannot be used by the community, and reyalp always has last word: he can refuse or accept // it's simple.  So please understand that in case I do have a failure of acceptance in Phases 3 or 4 halting this endeavor in its tracks, it would be totally unrelated to the improvements we made to CHDK.  But if I have that failure and cannot use the new subsystem, I have to feel that I have already made some decent contributions not because you may feel jaded, but because I would feel unfair to the CHDK developers and your community.  Likewise, in contrast to my last paragraph, I say for a third time: there are features specific to this project I would very much like to have but am not permitted to ask for.  I gave you one example already.

Please do not misconstrue this as "grand plans": in the hopeful (and risky) event that all works out, and future investigations using the new subsystem produce discoveries worth publishing, a CHDK citation with one or two named principals will be merited, no question. (This does not apply to the upcoming publication, which has been 4 years in the making and already has its necessary data.)

Hence the implicit agreement between the CHDK developers and myself is the one of my link above.  I hope that helps.



*

Offline reyalp

  • ******
  • 13673
Re: CHDKPTP - PC Remote Control Performance Analysis
« Reply #224 on: 19 / September / 2012, 00:23:56 »
Quote "Your ReadFDir romlogs appear to be an out of memory error. I'm pretty certain this isn't connected to file save timing, or anything related to SD card timing characteristics."

We all know software errors can masquerade in a myriad of seemingly disconnected ways.  In my humble opinion from what I am experiencing here, this is an asynchronous resource contention issue.  There is a small group of fail modes that the dumps report and that occur during listdir.  My rationale is this: why does failure frequency decrease with increased post-shoot delay?  It's because you're pushing the collision window to where the probability of it happening is increasingly small.  From this I can visualize these two possible contributors:
You are assuming there is only one type of failure going on. This is not supported by the evidence. As I said before, the FsIoNotify 451 assert is well understood: it happens when too many file handles are open on the camera. So a reasonable theory is that it happens when the camera hasn't finished writing the jpeg (note however that even this isn't proven; there could be something else triggered by the shot that involves file handles, like updating the CANONMSC directory or some task re-scanning DCIM).

The immediate cause of the ReadFDir.c Line 335 from the Sep 17 romlog is also clear. From disassembly, the firmware is essentially doing
Code:
p=malloc(0x8000);
assert(p!=0);
While it isn't uncommon for code to return an error like "out of memory" for unrelated errors, that is not what is happening here.

I'm very confident in this conclusion. If you point out specific reasons my analysis of that bit of code is wrong, I will certainly reconsider, but I'm not going to have much patience for speculation that just ignores it.

Why malloc (actually its very close cousin AllocateUncacheableMemory) failed is not clear, but given that you had done almost 2500 shots, it's quite likely that it just couldn't allocate the requested amount of memory. This *could* still be somewhat sensitive to the timing, since the shooting process presumably allocates and frees some memory. If you are still logging get_meminfo, that could shed light on the question. If not, it would be worthwhile to do it for one of these long runs, to see if free memory trends down over a large number of shots. I predict it will.
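
If you do log free memory once per shot, a least-squares slope on the PC side gives a quick read on whether it trends down. A minimal Python sketch, assuming the log has already been reduced to (shot number, free bytes) pairs; how the get_meminfo output is captured and parsed is not shown, the function names are made up for illustration, and the sample values are invented. Note that total free memory is only a proxy: a 0x8000 byte allocation can also fail from fragmentation while plenty of memory is free in total.

```python
def slope(samples):
    """Least-squares slope of free bytes versus shot number.

    samples: iterable of (shot_number, free_bytes) pairs.
    """
    pts = list(samples)
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("shot numbers must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx


def shots_until(samples, threshold_bytes):
    """Extrapolate from the last sample how many more shots fit before
    free memory drops below threshold_bytes.

    Returns None when memory is not trending down.
    """
    pts = list(samples)
    m = slope(pts)
    if m >= 0:
        return None
    _, last_free = pts[-1]
    return max(0.0, (last_free - threshold_bytes) / -m)


if __name__ == "__main__":
    # Invented samples: 100 bytes lost per shot.
    log = [(0, 1000000), (100, 990000), (200, 980000)]
    print(slope(log))
    print(shots_until(log, 0x8000))  # 0x8000 is the allocation size in the assert
```

A clearly negative slope over a few thousand shots would point to a leak; a flat one would point back toward fragmentation, corruption or timing.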

It's also possible it could be caused by something else, like memory corruption or an error somewhere earlier that causes a leak.

Quote
In addition, if this really were an out-of-memory error, you most likely would not get the back-to-back failures that occurred twice with delay time of 1.4s+, especially immediately after a hand-clean of directories and cold restart. 
You are assuming the crash after cleaning generated the romlog. As best as I can tell, the evidence is against that, since the romlogs downloaded before and after the second crash both show the same crash. These are not two logs of an identical error; they are from the same event.
Don't forget what the H stands for.

*

Offline SticK

  • *****
  • 779
Re: CHDKPTP - PC Remote Control Performance Analysis
« Reply #225 on: 19 / September / 2012, 00:56:44 »
You may be right.  1.6s delay. Fail at 2297 myshoots (JPG & CanonCR2 per myshoot).  This one occurred after the 2 files were downloaded correctly (during the rm command).

Can you please decipher the romlog?

*

Offline reyalp

  • ******
  • 13673
Re: CHDKPTP - PC Remote Control Performance Analysis
« Reply #226 on: 19 / September / 2012, 00:57:01 »
Here is a build for s90 100c.

Also attached is a lua script to dump a ROM image.

To use it with chdkptp, do something like the following:
upload it to the camera with
> u romdump.lua
then run it with
> =loadfile('A/romdump.lua')()
then download the ROM image with
> d PRIMARY.BIN S90_100C.BIN
and upload it to a file hosting site.

If you can do the same for the 101a camera, that would also help because the dump we have of that one isn't quite complete.

Thanks.
Don't forget what the H stands for.

*

Offline reyalp

  • ******
  • 13673
Re: CHDKPTP - PC Remote Control Performance Analysis
« Reply #227 on: 19 / September / 2012, 00:59:06 »
Can you please decipher the romlog?
This is the same malloc failure as the previous one. Definitely suggest tracking free memory on the next long run.
Don't forget what the H stands for.


*

Offline SticK

  • *****
  • 779
Re: CHDKPTP - PC Remote Control Performance Analysis
« Reply #228 on: 19 / September / 2012, 01:06:09 »
I had started a long run on the S90-100c camera a bit earlier, with 1.6s delay.  It has a C10 card which saves the file in ~3/4s, twice as fast as the S90-101a.

Do you want me to stop this one and restart on the 100c with memory sniff on?

*

Offline SticK

  • *****
  • 779
Re: CHDKPTP - PC Remote Control Performance Analysis
« Reply #229 on: 19 / September / 2012, 01:08:27 »
ROM Images // very tired // will get them tomorrow for you.

 
