CHDK PTP interface - page 116 - General Discussion and Assistance - CHDK Forum supplierdeeply

CHDK PTP interface

  • 1241 Replies
  • 487526 Views
*

Offline reyalp

  • ******
  • 14079
Re: CHDK PTP interface
« Reply #1150 on: 18 / March / 2019, 00:56:36 »
Advertisements
I ran into a weird bug on D10.
PTP commands that send specific lengths of data fail.
Some data (seems to be consistently 5032 bytes in my tests) is received, but the remainder is mostly filled with other junk from RAM. The PTP command then fails with I/O error on the chdkptp side, after a significant time.
Aside from the stalled PTP connection, everything seems fine with the camera, it can be reconnected after issuing
Code: [Select]
!chdk.reset_device(con.condev)
This can happen with file upload, or sending messages. For files, the length includes the filename and 4 bytes for the filename length in addition to the data itself. Messages send exactly the message content length. I assume other bulk sends are affected as well.

None of my other cameras (a540, sx160, elph130, g7x, sx710, elph180) appear to be affected.

Lengths of data that fail (tested exhaustively with extras/msgtest.lua) are
0x23f4
0x25f4
0x27f4
0x29f4
0x2bf4
0x2df4
0x2ff4
0x31f4
...and further increments of 512

I only ran into this because a file I was trying to upload ended up with name info and contents being exactly 0x2bf4. This happens on multiple different hubs from a PC and on a raspberry pi, so it's not a host or hub issue.

The sizes above are not close to the limit of available memory, and it happens with or without a script running, so it's not dependent on some exact memory layout on the camera.

A simple way to test for the issue is:
Code: [Select]
!con:write_msg(('x'):rep(0x23f4))
If the bug isn't present, you will immediately get
Code: [Select]
ERROR: call failed:no script running
If it is present, it will hang for several seconds and you will get:
Code: [Select]
ERROR: call failed:I/O error

To exhaustively test a range of sizes, you can use something like
Code: [Select]
!m=require'extras/msgtest'
!m.test{size=1,sizemax=0x3000,sizeinc=1,gc='step'}
It will stop if it hits the error.
Note that prior to chdkptp r846 it would stop one before sizemax.

I would be interested to know if cameras similar to D10 are affected (DryOS 31, 2009, Digic IV)

This is somewhat reminiscent of the old digic 2 alignment bug: https://chdk.setepontos.com/index.php?topic=6730.msg76676#msg76676 but the specifics are obviously different.
Don't forget what the H stands for.

*

Offline srsa_4c

  • ******
  • 4451
Re: CHDK PTP interface
« Reply #1151 on: 18 / March / 2019, 19:10:21 »
I would be interested to know if cameras similar to D10 are affected (DryOS 31, 2009, Digic IV)
Tested my r31 and r23 cams. The one affected is the ixus110/sd960 (r31). Unaffected: sx100 (r23), a470 (r23), g10 (r31), ixus870/sd880 (r31). The a470 crashed once, but not at those sizes and I could not reproduce.

*

Offline reyalp

  • ******
  • 14079
Re: CHDK PTP interface
« Reply #1152 on: 23 / March / 2019, 17:15:27 »
I would be interested to know if cameras similar to D10 are affected (DryOS 31, 2009, Digic IV)
Tested my r31 and r23 cams. The one affected is the ixus110/sd960 (r31). Unaffected: sx100 (r23), a470 (r23), g10 (r31), ixus870/sd880 (r31). The a470 crashed once, but not at those sizes and I could not reproduce.
Thanks. I assume you were using the msgtest test when a470 crashed? If there's a romlog, I'd be interested to see it.

> r31 cams could be interesting too, after D10, the next oldest I have is SX160 (DryOS 51, digic IV, 2012)
Don't forget what the H stands for.

*

Offline srsa_4c

  • ******
  • 4451
Re: CHDK PTP interface
« Reply #1153 on: 23 / March / 2019, 19:01:45 »
I assume you were using the msgtest test when a470 crashed?
Yes. Romlog attached (a470 102c) (edit: date is wrong, of course). My theory is that one of the Canon libraries had a bug that was relatively quickly fixed and only affected a few models.

Quote
> r31 cams could be interesting too, after D10, the next oldest I have is SX160 (DryOS 51, digic IV, 2012)
a490 (r43), a3200 (r47): no failures, tested range 0x2300...0x3000.


*

Offline reyalp

  • ******
  • 14079
Re: CHDK PTP interface
« Reply #1154 on: 24 / March / 2019, 02:18:52 »
Yes. Romlog attached (a470 102c) (edit: date is wrong, of course).
Thanks. That's deep in the Canon PTP code, which suggests it could be an actual PTP problem, rather than say, running out of memory or some other unrelated CHDK issue.

Quote
My theory is that one of the Canon libraries had a bug that was relatively quickly fixed and only affected a few models.
Or chdkptp is doing something not quite correct that other versions tolerate. There are some weird rules about sending a zero-length packets when the data length is an exact multiple of the max packet size.

Looking for ways to debug this, I was reminded there appears to be a separate logging system connected to PTP, which maybe writes to "A/BulkTrns.Log"

The logging function is FF9F7AF0 on d10 (sub_FFD9A06C on your a470)

I haven't yet figured out how to enable it. figured it out while writing this post ;)  It appears to be set up in sub_FFD9A268 based on a call to FF9FA8F8

This seems to be checking some kind of system with named values, but "PtpBulkLogFlag" is only mentioned in that one place, so it's not clear how to set it. Maybe it's compile time in Canon code. Other names used by the same system (PlatformWrapper.c apparently) are referenced in other places, e.g "PTPConfig.SupportedSlideShow", "PTPConfig.VenderExtensionVerrsion", "PTP GetDeviceInfo Configuration"

There appears to be 3 functions associated with this system:
A set / register function sub_FF9FA7D0 ("name",value)
Value seems to be a pointer or arbitrary constant
A clear / unset function sub_FF9FA85 ("name")
and the above mentioned get function
status =  sub_FF9FA8F8("name",*value)
which returns 0 if value is present, or has bit 1 set if not. The original value saved is returned in *value.

The name, value pairs are stored sequentially in memory, starting at 0x785A4 on d10.

Based on the code that checks "PtpBulkLogFlag" the value doesn't matter, as long as it's not 0.

Putting this all together, I made a Lua script that enables the bulk log flag. This appears to only take effect on next physical USB connection (i.e. USB bit being set) and isn't flushed until disconnection. The log is appended between session. The function pointer values are obviously camera specific, but if they turn out to be useful, they should be easy to add to the sigfinders.

In the failed case, the log looks like this
Code: [Select]
00024920: Op.[0x1002], P1[0x00000001]

00024930: Res.Comp

00024930: Op.[0x1001]

00024930: SendData,Len=385

00024930: Res.Comp

00024930: Op.[0x9999], P1[0x00000000]

00024940: Res.Comp

00035250: Op.[0x9999], P1[0x0000000b]

00035250: ReceiveData,Len=9204

00035250: WaitNullPacket fNullEnd = 1, Total=9216, Current=512, SendLen=87
00048970: DoTransactionCanceled()
Op 0x9999 P1 = 0xb is PTP_CHDK_WriteScriptMsg
Don't forget what the H stands for.

*

Offline srsa_4c

  • ******
  • 4451
Re: CHDK PTP interface
« Reply #1155 on: 24 / March / 2019, 17:02:22 »
Tried 2 cams, one with the issue, one without.
Enabled the ptp log via the script, connected usb, pressed "connect" on chdkptp gui, executed the !con:write_msg(('x'):rep(0x23f4)) one-liner, after a few seconds pressed "disconnect", and finally I disconnected usb before shutting down the cam.
g10, has no issue:
Code: [Select]
00028010: Op.[0x9999], P1[0x0000000a]
00028010: SendData,Len=758
00028010: Res.Comp
00028020: Op.[0x9999], P1[0x00000008]
00028020: Res.Comp
00028020: Op.[0x9999], P1[0x0000000a]
00028020: SendData,Len=4
00028030: Res.Comp
00028030: Op.[0x9999], P1[0x00000008]
00028040: Res.Comp
00033370: Op.[0x9999], P1[0x0000000b]
00033370: ReceiveData,Len=9204
00033370: WaitNullPacket fNullEnd = 1, Total=9216, Current=512, SendLen=87
00033370: Res.Comp
00041990: Op.[0x1003]
00041990: Res.Comp
00042050: DoTransactionCanceled()
00044170: DoTransactionCanceled()
ixus110, has the issue:
Code: [Select]
00016630: Op.[0x9999], P1[0x00000008]
00016630: Res.Comp
00016790: Op.[0x9999], P1[0x00000008]
00016790: Res.Comp
00016790: Op.[0x9999], P1[0x0000000a]
00016790: SendData,Len=667
00016790: Res.Comp
00016800: Op.[0x9999], P1[0x00000008]
00016800: Res.Comp
00016800: Op.[0x9999], P1[0x0000000a]
00016800: SendData,Len=4
00016800: Res.Comp
00016810: Op.[0x9999], P1[0x00000008]
00016810: Res.Comp
00022040: Op.[0x9999], P1[0x0000000b]
00022040: ReceiveData,Len=9204
00022040: WaitNullPacket fNullEnd = 1, Total=9216, Current=512, SendLen=87
00036370: DoTransactionCanceled()
00038320: DoTransactionCanceled()
On the Ixus, USB communication after the test message is disrupted. I'm not 100% sure, but I think the first DoTransactionCanceled() appears after the (unsuccessful?) chdkptp disconnection attempt, the second one comes after the USB physical disconnection.
On the g10, I'd guess Op.[0x1003] initiates the disconnection and DoTransactionCanceled() appears when disconnect is done and again when the cam is physically disconnected.

Would it be possible to send that null packet?

*

Offline reyalp

  • ******
  • 14079
Re: CHDK PTP interface
« Reply #1156 on: 24 / March / 2019, 17:59:10 »
Some additional notes (written at the same time as srsa_4c...):
Output on non-affected cam is similar. Elph130, with 0x23f3 and 0x23f4 length messages:
Code: [Select]
00026150: Op.[0x1002], P1[0x00000001]
00026150: Res.Comp
00026150: Op.[0x1001]
00026150: SendData,Len=417
00026160: Res.Comp
00026160: Op.[0x9999], P1[0x00000000]
00026160: Res.Comp
00040640: Op.[0x9999], P1[0x0000000b]
00040640: ReceiveData,Len=9203
00040640: Res.Comp
00051980: Op.[0x9999], P1[0x0000000b]
00051980: ReceiveData,Len=9204
00051980: WaitNullPacket fNullEnd = 1, Total=9216, Current=512, SendLen=87
00051980: Res.Comp
00055710: Op.[0x1003]
00055710: Res.Comp
00055720: DoTransactionCanceled()
00058660: DoTransactionCanceled()
(1001 = PTP_OC_GetDeviceInfo, 1002 = PTP_OC_OpenSession, 1003 = PTP_OC_CloseSession)

Total is +12 bytes for the request, which makes the failed cases 9216 = 18*512.
I'm not clear what the SendLen=87 refers to.
Also noteworthy multiples of 512 from 2 to 17 trigger the WaitNullPacket message without issue. I don't see anything obvious in chdkptp that would depend on whether the size was >= 18*512. The chdkptp code does *not* break writes in to 512 byte chunks. It does send the null packet when the total is a multiples of 512. This logic is unchanged from ptpcam.

chdkptp side logging on D10
Code: [Select]
con> !con:write_msg(('x'):rep(512*18-12))
ptp_usb_sendreq trans=5 len=0x14 (20) code=0x9999 nparam=2 param1=0x0000000b
USB_BULK_WRITE returned 20, size=20
ptp_usb_senddata trans=5 len=0x2400 (9216) code=0x9999
USB_BULK_WRITE returned 512, size=512
USB_BULK_WRITE returned 8704, size=8704
USB_BULK_WRITE returned 0, size=0
usb_bulk_read: No error
ERROR: call failed:I/O error
Elph130 is similar except it except it doesn't end with IO error.

On separate topic, the "PlatformWrapper" code could potentially give us a way to override other things, like the DeviceInfo information or supported events.
Don't forget what the H stands for.

*

Offline reyalp

  • ******
  • 14079
Re: CHDK PTP interface
« Reply #1157 on: 24 / March / 2019, 19:16:19 »
On the Ixus, USB communication after the test message is disrupted. I'm not 100% sure, but I think the first DoTransactionCanceled() appears after the (unsuccessful?) chdkptp disconnection attempt, the second one comes after the USB physical disconnection.
On the g10, I'd guess Op.[0x1003] initiates the disconnection and DoTransactionCanceled() appears when disconnect is done and again when the cam is physically disconnected.
I think the double DoTransactionCanceled() in both cases is due to the chdkptp disconnection code issuing usb_reset(). Commenting this this out seems to eliminate it, without obvious ill effect. (I've actually meant to make this optional for a long time...)

Quote
Would it be possible to send that null packet?
The log in my previous post seems to indicate that it is sent, and the whole sequence succeeds from the host POV, and the I/O error comes from reading the response. I guess ptp.c ptp_usb_senddata could be modified to send another one if the data is the magic length (to avoid messing up the transactions required to make the connection).
Don't forget what the H stands for.


*

Offline reyalp

  • ******
  • 14079
Re: CHDK PTP interface
« Reply #1158 on: 25 / March / 2019, 00:22:15 »
Some notes on capturing USB traffic. This wasn't immediately productive for the current issue, but I'm writing it down while it's fresh in my mind.

I did this on linux (ubuntu 18.04) because wireshark has some scary warnings on the windows USB capture driver.

Attempting to capture directly from wireshark failed with a permission issue.  It suggested
Code: [Select]
sudo dpkg-reconfigure wireshark-common
... check allow regular users to capture...
sudo usermod -a -G wireshark <user>
and re-logging in, but this didn't resolve the issue.

I captured as root instead using the command line program dumpcap instead

The capture interfaces are per bus, so usbmon1 is for devices on bus 1. While dumpcap has a filter parameter, it's not clear to me if or how you could filter to a specific USB device at capture time.

To capture everything on bus 1 to a file
Code: [Select]
dumpcap -i usbmon1 -w file.pcap
I could then open the file in wireshark.
To filter traffic involving the device (bus 1, device 9) I used
Code: [Select]
usb.addr contains "1.9"
The actual destination will show up as 1.9.x where x is the endpoint for the specific traffic. addr means it matches either source or destination.

This was successful and shows the requests, but wireshark doesn't appear to include a dissector for USB PTP (there is one for PTP/IP, and some work on a USB one a long time ago https://wiki.wireshark.org/USB-PTP https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=8431).

The > 512 writes show up as single USB_BULK packets. This is also seen on camera to host traffic. I assume we just aren't seeing things at the level that would show it broken up into separate packets.

edit:
I tried making ptp_usb_write_func call USB_BULK_WRITE with no more than 512 at a time, but this had no effect on the problem or the BulkTrns.Log output.
« Last Edit: 25 / March / 2019, 01:29:17 by reyalp »
Don't forget what the H stands for.

*

Offline reyalp

  • ******
  • 14079
Re: CHDK PTP interface
« Reply #1159 on: 25 / March / 2019, 02:09:48 »
I guess ptp.c ptp_usb_senddata could be modified to send another one if the data is the magic length (to avoid messing up the transactions required to make the connection).
I tried this, and the write for the second NULL packet timed out.

The error we normally see is the read for get_resp timing out. So it appears the camera state gets hosed during the writes, even though both the data write and null packet appear to succeed from the host POV.

From the camera side, we don't actually check check status on the data transfer function data->recv_data... I sometimes returns partial results, and is just waiting to be called again? At the start of this, I noticed that the data was only partially received.

edit:
recv_data returns 1 in the failed case, 0 on others.


The buf_size logic in recv_ptp_data is also weird.
« Last Edit: 25 / March / 2019, 02:46:43 by reyalp »
Don't forget what the H stands for.

 

Related Topics