suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras) - page 5 - General Discussion and Assistance - CHDK Forum

suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)

  • 77 Replies
  • 16576 Views
*

Offline philmoz

  • *****
  • 3332
    • Photos
Re: suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)
« Reply #40 on: 06 / October / 2017, 19:19:24 »
Advertisements
I'd guess a semaphore protecting suba would probably fix this.

Adding a semaphore seems to work (using the firmware functions CreateBinarySemaphore, TakeSemaphore & GiveSemaphore).

It does seem to slow things down a bit though - at least msgtest.lua seems slower to me.

Attached is a test build for SX150IS and the suba semaphore patch.

Phil.

Edit: I did some timing tests to see the effect of adding the semaphore (looping 100,000 times with a single call to malloc followed by free).

On my G12 the original code takes 3.2usec per loop iteration.
Adding the semaphore increases this to 7.7usec per iteration, with the additional safety checks on the free list it jumps to 9usec.
By rewriting and optimising the suba code I can get this down to 6.5usec (semaphore) or 7.8usec (semaphore & safety checks).
Since the problem doesn't see to be related to memory corruption I think the safety checks can be left out (although I would probably make them conditionally compilable into the code).

So we are looking at ~100% increase in the call time for malloc/free.
« Last Edit: 07 / October / 2017, 05:39:22 by philmoz »
CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)
  g5x (1.00c, 1.01a, 1.01b)
  g7x2 (1.01a, 1.01b, 1.10b)

*

Offline philmoz

  • *****
  • 3332
    • Photos
Re: suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)
« Reply #41 on: 07 / October / 2017, 06:00:43 »
Updated patch for adding semaphore to suba.c, with the optimised suba code.
This also includes CreateBinarySemaphore for Digic6 cameras and updated stubs_entry.S files.


Phil.

CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)
  g5x (1.00c, 1.01a, 1.01b)
  g7x2 (1.01a, 1.01b, 1.10b)

Re: suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)
« Reply #42 on: 07 / October / 2017, 09:39:16 »
Looking good folks!!

With 16 cameras, 50 repetitions (M mode / record) of:
  • preshoot->usb_sync_wait->open 5V switch->download->delete
worked fine, followed by 50 repetitions of
  • preshoot->usb_sync_wait->open 5V switch
and then
Code: [Select]
!mc:delete_images_list(mc:imglist(),{verbose=true})deleted all 800 images, in the one go, without fail.


Of some slight interest, perhaps, prior to this test (with Phil's V2 build) I had manually copied 500 images (copies of the same file) to a single camera (I had used http://www.bulkrenameutility.co.uk/Main_Intro.php to name them as IMG_0001.JPG through IMG_0500.JPG).

On executing:
Code: [Select]
!mc:delete_images_list(mc:imglist(),{verbose=true})nothing was returned to the CLI but there was no apparent crash. Having assumed the camera had entered some recursive loop (somewhere), or something?, I switched it off and, manually, deleted 400 images from the card and reran the !mc:delete command which then did start to provide cli feedback but crashed, unsurprisingly, at the 97th delete, cli trace of that here:

Code: [Select]
___> !mc=require('multicam')
___> !mc:connect({list='C:/CHDKPTP/listfile'})
+ 1:Canon PowerShot SX150 IS b=\\.\libusb0-0001--0x04a9-0x3234 d=bus-0 s=D10AC15
E25614FEBB5AF36C058AE46E1
___> !mc:start()
___> !return mc:cmdwait('rec')
rec
=true,{
 [10]={
  done=true,
  status={
   status=true,
   cmd="rec",
  },
  failed=false,
 },
}
___> !mc:delete_images_list(mc:imglist(),{verbose=true})
os.remove("A/DCIM/156___10/IMG_0001.JPG")
os.remove("A/DCIM/156___10/IMG_0002.JPG")
os.remove("A/DCIM/156___10/IMG_0003.JPG")
os.remove("A/DCIM/156___10/IMG_0004.JPG")
os.remove("A/DCIM/156___10/IMG_0005.JPG")
os.remove("A/DCIM/156___10/IMG_0006.JPG")
os.remove("A/DCIM/156___10/IMG_0007.JPG")
os.remove("A/DCIM/156___10/IMG_0008.JPG")
os.remove("A/DCIM/156___10/IMG_0009.JPG")
os.remove("A/DCIM/156___10/IMG_0010.JPG")
os.remove("A/DCIM/156___10/IMG_0011.JPG")
os.remove("A/DCIM/156___10/IMG_0012.JPG")
os.remove("A/DCIM/156___10/IMG_0013.JPG")
os.remove("A/DCIM/156___10/IMG_0014.JPG")
os.remove("A/DCIM/156___10/IMG_0015.JPG")
os.remove("A/DCIM/156___10/IMG_0016.JPG")
os.remove("A/DCIM/156___10/IMG_0017.JPG")
os.remove("A/DCIM/156___10/IMG_0018.JPG")
os.remove("A/DCIM/156___10/IMG_0019.JPG")
os.remove("A/DCIM/156___10/IMG_0020.JPG")
os.remove("A/DCIM/156___10/IMG_0021.JPG")
os.remove("A/DCIM/156___10/IMG_0022.JPG")
os.remove("A/DCIM/156___10/IMG_0023.JPG")
os.remove("A/DCIM/156___10/IMG_0024.JPG")
os.remove("A/DCIM/156___10/IMG_0025.JPG")
os.remove("A/DCIM/156___10/IMG_0026.JPG")
os.remove("A/DCIM/156___10/IMG_0027.JPG")
os.remove("A/DCIM/156___10/IMG_0028.JPG")
os.remove("A/DCIM/156___10/IMG_0029.JPG")
os.remove("A/DCIM/156___10/IMG_0030.JPG")
os.remove("A/DCIM/156___10/IMG_0031.JPG")
os.remove("A/DCIM/156___10/IMG_0032.JPG")
os.remove("A/DCIM/156___10/IMG_0033.JPG")
os.remove("A/DCIM/156___10/IMG_0034.JPG")
os.remove("A/DCIM/156___10/IMG_0035.JPG")
os.remove("A/DCIM/156___10/IMG_0036.JPG")
os.remove("A/DCIM/156___10/IMG_0037.JPG")
os.remove("A/DCIM/156___10/IMG_0038.JPG")
os.remove("A/DCIM/156___10/IMG_0039.JPG")
os.remove("A/DCIM/156___10/IMG_0040.JPG")
os.remove("A/DCIM/156___10/IMG_0041.JPG")
os.remove("A/DCIM/156___10/IMG_0042.JPG")
os.remove("A/DCIM/156___10/IMG_0043.JPG")
os.remove("A/DCIM/156___10/IMG_0044.JPG")
os.remove("A/DCIM/156___10/IMG_0045.JPG")
os.remove("A/DCIM/156___10/IMG_0046.JPG")
os.remove("A/DCIM/156___10/IMG_0047.JPG")
os.remove("A/DCIM/156___10/IMG_0048.JPG")
os.remove("A/DCIM/156___10/IMG_0049.JPG")
os.remove("A/DCIM/156___10/IMG_0050.JPG")
os.remove("A/DCIM/156___10/IMG_0051.JPG")
os.remove("A/DCIM/156___10/IMG_0052.JPG")
os.remove("A/DCIM/156___10/IMG_0053.JPG")
os.remove("A/DCIM/156___10/IMG_0054.JPG")
os.remove("A/DCIM/156___10/IMG_0055.JPG")
os.remove("A/DCIM/156___10/IMG_0056.JPG")
os.remove("A/DCIM/156___10/IMG_0057.JPG")
os.remove("A/DCIM/156___10/IMG_0058.JPG")
os.remove("A/DCIM/156___10/IMG_0059.JPG")
os.remove("A/DCIM/156___10/IMG_0060.JPG")
os.remove("A/DCIM/156___10/IMG_0061.JPG")
os.remove("A/DCIM/156___10/IMG_0062.JPG")
os.remove("A/DCIM/156___10/IMG_0063.JPG")
os.remove("A/DCIM/156___10/IMG_0064.JPG")
os.remove("A/DCIM/156___10/IMG_0065.JPG")
os.remove("A/DCIM/156___10/IMG_0066.JPG")
os.remove("A/DCIM/156___10/IMG_0067.JPG")
os.remove("A/DCIM/156___10/IMG_0068.JPG")
os.remove("A/DCIM/156___10/IMG_0069.JPG")
os.remove("A/DCIM/156___10/IMG_0070.JPG")
os.remove("A/DCIM/156___10/IMG_0071.JPG")
os.remove("A/DCIM/156___10/IMG_0072.JPG")
os.remove("A/DCIM/156___10/IMG_0073.JPG")
os.remove("A/DCIM/156___10/IMG_0074.JPG")
os.remove("A/DCIM/156___10/IMG_0075.JPG")
os.remove("A/DCIM/156___10/IMG_0076.JPG")
os.remove("A/DCIM/156___10/IMG_0077.JPG")
os.remove("A/DCIM/156___10/IMG_0078.JPG")
os.remove("A/DCIM/156___10/IMG_0079.JPG")
os.remove("A/DCIM/156___10/IMG_0080.JPG")
os.remove("A/DCIM/156___10/IMG_0081.JPG")
os.remove("A/DCIM/156___10/IMG_0082.JPG")
os.remove("A/DCIM/156___10/IMG_0083.JPG")
os.remove("A/DCIM/156___10/IMG_0084.JPG")
os.remove("A/DCIM/156___10/IMG_0085.JPG")
os.remove("A/DCIM/156___10/IMG_0086.JPG")
os.remove("A/DCIM/156___10/IMG_0087.JPG")
os.remove("A/DCIM/156___10/IMG_0088.JPG")
os.remove("A/DCIM/156___10/IMG_0089.JPG")
os.remove("A/DCIM/156___10/IMG_0090.JPG")
os.remove("A/DCIM/156___10/IMG_0091.JPG")
os.remove("A/DCIM/156___10/IMG_0092.JPG")
os.remove("A/DCIM/156___10/IMG_0093.JPG")
os.remove("A/DCIM/156___10/IMG_0094.JPG")
os.remove("A/DCIM/156___10/IMG_0095.JPG")
os.remove("A/DCIM/156___10/IMG_0096.JPG")
ERROR: call failed:I/O error
___> !mc:camstatus()
#1 NOT connected

Summary: 1 cameras (0 connected, 1 disconnected)
___>

In any case,

thanks for all your help  :)
« Last Edit: 07 / October / 2017, 14:08:36 by andrew.stephens.754365 »

*

Offline reyalp

  • ******
  • 13440
Re: suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)
« Reply #43 on: 07 / October / 2017, 16:02:34 »
On my G12 the original code takes 3.2usec per loop iteration.
Adding the semaphore increases this to 7.7usec per iteration, with the additional safety checks on the free list it jumps to 9usec.
I'm not very worried about the performance of malloc. Outside of Lua, I don't thing we do a significant amount of dynamic memory allocation in performance critical code. Lua has enough other overhead that I doubt this is significant.

Purely out of curiosity, it might be interesting to know how OS malloc compares.

Quote
Since the problem doesn't see to be related to memory corruption I think the safety checks can be left out (although I would probably make them conditionally compilable into the code).
Having them conditionally compilable at least seems like a good idea. I wouldn't be averse to having them enabled all the time either.

I have also been thinking about using asserts more in CHDK code... maybe a topic for another thread.

@andrew.stephens.754365 did the crash while deleting generate a romlog?

As usual, the chdkptp output just says the connection went away.
Don't forget what the H stands for.


*

Offline philmoz

  • *****
  • 3332
    • Photos
Re: suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)
« Reply #44 on: 07 / October / 2017, 16:47:09 »
Purely out of curiosity, it might be interesting to know how OS malloc compares.


4.8usec per iteration.


Quote
Quote
Since the problem doesn't see to be related to memory corruption I think the safety checks can be left out (although I would probably make them conditionally compilable into the code).
Having them conditionally compilable at least seems like a good idea. I wouldn't be averse to having them enabled all the time either.


Because this is a significant code rewrite, I want to do some more testing on the suba code before committing this to SVN.
Any additional review/testing/comments would be welcome (my time is limited these days).


Phil.
CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)
  g5x (1.00c, 1.01a, 1.01b)
  g7x2 (1.01a, 1.01b, 1.10b)

Re: suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)
« Reply #45 on: 07 / October / 2017, 16:56:05 »
did the crash while deleting generate a romlog?

Yes, i'm almost entirely sure the attached was it.

*

Offline philmoz

  • *****
  • 3332
    • Photos
Re: suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)
« Reply #46 on: 07 / October / 2017, 17:09:56 »
did the crash while deleting generate a romlog?

Yes, i'm almost entirely sure the attached was it.


Can you please try the 500 image copy/delete with this build.


There may have been an issue with the previous build (or there could still be other problem).
I'll try it on my G12 later today as well.


Phil.

CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)
  g5x (1.00c, 1.01a, 1.01b)
  g7x2 (1.01a, 1.01b, 1.10b)

Re: suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)
« Reply #47 on: 07 / October / 2017, 17:18:37 »
Will do that first thing tomorrow Phil - nearly bedtime here.

Thanks,

Andrew.


*

Offline philmoz

  • *****
  • 3332
    • Photos
Re: suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)
« Reply #48 on: 07 / October / 2017, 21:40:28 »
Of some slight interest, perhaps, prior to this test (with Phil's V2 build) I had manually copied 500 images (copies of the same file) to a single camera (I had used http://www.bulkrenameutility.co.uk/Main_Intro.php to name them as IMG_0001.JPG through IMG_0500.JPG).

On executing:
Code: [Select]
!mc:delete_images_list(mc:imglist(),{verbose=true})nothing was returned to the CLI but there was no apparent crash.




Same behaviour on my G12
- 459 images on the card, no output from mc:delete_images_list, chdkptp appears to hang.
- 150 images on the card, no output from mc:delete_images_list, returns immediately to the chdkptp command prompt
- 100 images on the card, images deleted successfully.


I did not get any crashes doing this on my G12 with the new suba code.


Phil.
CHDK ports:
  sx30is (1.00c, 1.00h, 1.00l, 1.00n & 1.00p)
  g12 (1.00c, 1.00e, 1.00f & 1.00g)
  sx130is (1.01d & 1.01f)
  ixus310hs (1.00a & 1.01a)
  sx40hs (1.00d, 1.00g & 1.00i)
  g1x (1.00e, 1.00f & 1.00g)
  g5x (1.00c, 1.01a, 1.01b)
  g7x2 (1.01a, 1.01b, 1.10b)

*

Offline reyalp

  • ******
  • 13440
Re: suba threading bug (was Re: 3D scanner with multiple IXUS160 cameras)
« Reply #49 on: 08 / October / 2017, 00:37:31 »
Because this is a significant code rewrite, I want to do some more testing on the suba code before committing this to SVN.
Any additional review/testing/comments would be welcome (my time is limited these days).
Mine too, thanks to taking the lead on this. I'll try to take look at the code and do some testing.

One initial thought: Should the takeSemaphore have a timeout? IIRC (referring back to https://chdk.setepontos.com/index.php?topic=9986.msg113358#msg113358) 0 is forever, and it would probably be better to crash/assert than just totally freeze. Actually doing the suba operations should take negligible time, though the task switches could occur while a task was in suba code, so one it would be reasonable to expect the semaphore to be held for hundreds of ms, but probably not multiple seconds.

I was going to say we should test vxworks, but I don't think any are configured to use suba by default. A610 is only one I saw with exmem related stuff in makefile.inc

FWIW, I'd be fine with just protecting the existing suba malloc/free/meminfo with a semaphore without any optimization or additional error checking.

Quote
I did not get any crashes doing this on my G12 with the new suba code.
Meaning 500 images deleted without problems?
Don't forget what the H stands for.

 

Related Topics