1.4.19d1 transfer issues

  • I wanted to come back to an issue first brought up in the 1.4.19d1 thread.

    When transferring images from Conquest 1.4.19b, we can send to a GE Linux PACS server without any problems.

    But when going to 1.4.19d1 we see the following appear in the logs and transfers are cancelled:

    Quote

    2019-06-19 15:58:57 EDT ERROR-|DicomServiceTemplate:271| Exception in creating dicom dataSpecified length (27269900) of PDU exceeds limit: 1048576

    java.io.IOException: Specified length (27269900) of PDU exceeds limit: 1048576

    Reverting back to the old version fixes the problem again.

    To try and see what might be going on, I took some captures with Wireshark and I'm noticing a difference in the traffic flow.

    While the old version sends the data in small PDU segments, the new one seems to put everything into one large transfer, as far as I can tell.

    Screenshots attached so someone more knowledgeable can have a look and hopefully let me know what is going on here.


    Let me know if more info is needed.


    Thanks in advance
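
For readers following the error above: the receiver rejects a PDU whose declared length (27269900) exceeds its limit (1048576). The following is a minimal sketch of how a sender could split an encoded data set into P-DATA-TF PDUs that stay under such a limit. It is not Conquest's code; the function name `fragment_into_pdus` and the default presentation context ID are hypothetical, and it assumes the limit applies to the variable field of each PDU, as in the DICOM upper layer protocol.

```python
import struct


def fragment_into_pdus(data: bytes, max_pdu_length: int, context_id: int = 1):
    """Split an encoded data set into P-DATA-TF PDUs.

    The variable field of every PDU produced here is at most
    max_pdu_length bytes. Hypothetical helper, for illustration only.
    """
    # Each PDV carries a 4-byte length, a 1-byte presentation context ID
    # and a 1-byte message control header in front of the data fragment.
    pdv_overhead = 6
    chunk_size = max_pdu_length - pdv_overhead
    if chunk_size <= 0:
        raise ValueError("max_pdu_length is too small to carry any data")

    pdus = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        is_last = offset + chunk_size >= len(data)
        # Control header: bit 0 clear = data set, bit 1 set = last fragment.
        control = 0x02 if is_last else 0x00
        pdv = struct.pack(">IBB", len(chunk) + 2, context_id, control) + chunk
        # PDU header: type 0x04 (P-DATA-TF), reserved byte, 4-byte length.
        pdus.append(struct.pack(">BBI", 0x04, 0x00, len(pdv)) + pdv)
    return pdus


if __name__ == "__main__":
    limit = 1048576               # limit reported in the log
    payload = bytes(27269900)     # dummy payload of the size reported in the log
    pdus = fragment_into_pdus(payload, limit)
    largest = max(struct.unpack(">I", p[2:6])[0] for p in pdus)
    print(len(pdus), "PDUs, largest declared length:", largest)
```

Sent this way, no single PDU declares a length above the limit, which matches the "small PDU segments" pattern seen in the capture of the older version.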

  • The d2 release is acting like the b version again.

    So our normal images are transferring.

    I'm still having an issue transferring large BTO images from Conquest to a new cloud gateway we are dealing with.

    It is failing due to a read time-out on the gateway side.


    1.4.19b and 1.4.19d2 now behave the same way here.

    I attached an image of the traffic captured during the transfer.

    At some point there is a sudden gap in the traffic, which causes the receiving end to time out and cancel the transfer.


    There is nothing in the DICOM logs, even with debug set to the maximum level.

    Both machines are on the same virtual hosts, so there is no packet loss or firewall rules in play here.

    I even turned all the offloading options etc. off on the NIC to make sure they aren't interfering.


    Since we do not have any issues with other vendors and other systems, I always thought the problem was on the receiving end. But the Wireshark captures now show that no new images are being sent from Conquest, so I'm guessing it might be the sending side.


    Anything you can see or think of?

  • The blue mark on the screenshot is where the remote server sent an abort command due to the time-out.

    Before the abort command, it looks to me like one of the images finishes transferring (line 14984).

    After that, the next one starts (line 15985).

    This is followed by the transfer and ACK of 1 packet, after which everything stops until the remote server times out and cancels the transfer.

    It looks like Conquest just stops sending for some reason, but I can't find anything in its logs, not even with full debug enabled.
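
To pin down a stall like this without eyeballing screenshots, one option is to export the frame numbers and timestamps from the capture and look for the largest idle period. The sketch below assumes the frames are available as `(frame_number, seconds_since_capture_start)` pairs sorted by time; the helper name `largest_gap` and the sample values are invented for illustration and are not taken from the capture in this thread.

```python
def largest_gap(frames):
    """Return (gap_seconds, frame_before, frame_after) for the longest
    idle period between two consecutive frames.

    frames: sequence of (frame_number, seconds_since_capture_start),
    sorted by time. Hypothetical helper, for illustration only.
    """
    frames = list(frames)
    if len(frames) < 2:
        raise ValueError("need at least two frames to measure a gap")
    # Compare every frame with its successor and keep the widest pair.
    before, after = max(
        zip(frames, frames[1:]),
        key=lambda pair: pair[1][1] - pair[0][1],
    )
    return after[1] - before[1], before[0], after[0]


if __name__ == "__main__":
    # Made-up sample: steady traffic, then a long silence before frame 4.
    sample = [(1, 0.000), (2, 0.002), (3, 0.004), (4, 15.500), (5, 15.501)]
    gap, frame_before, frame_after = largest_gap(sample)
    print(f"longest gap: {gap:.3f} s between frames {frame_before} and {frame_after}")
```

If the longest gap lines up with the receiver's time-out and starts right after a new image begins, that points at the sender being busy (for example with conversion) rather than at the network.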

  • Yes, the studies we are sending are 3D mammograms, so around 800MB a piece or more.

    They are being converted from j2 to ul or un.

    When I try to pass them off as j2 to the receiving party it still does a conversion to j1.

    So there is always a conversion going on, but I didn't think decompressing a single image took 15 seconds and caused the timeout. The server has enough horsepower to do it pretty quickly. Is there any way to check on those timings?
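
One way to check such timings outside of Conquest is to wrap the conversion step in a timer and compare the result with the receiver's time-out. This is only a sketch: `timed` and `exceeds_timeout` are hypothetical helpers, and the `zlib` call is a stand-in workload, not a JPEG 2000 codec. In practice the body of the `with` block would be whatever tool performs the real decompression of one image.

```python
import time
import zlib
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def exceeds_timeout(elapsed_seconds, timeout_seconds):
    """True when a step took at least as long as the receiver is willing to wait."""
    return elapsed_seconds >= timeout_seconds


if __name__ == "__main__":
    receiver_timeout = 15.0  # seconds, the figure mentioned in the post above
    results = {}

    # Stand-in workload only: replace with the real per-image conversion.
    compressed = zlib.compress(bytes(10_000_000))
    with timed("decompress", results):
        zlib.decompress(compressed)

    elapsed = results["decompress"]
    print(f"decompress took {elapsed:.3f} s, "
          f"exceeds time-out: {exceeds_timeout(elapsed, receiver_timeout)}")
```

Timing each image separately shows whether a single conversion can outlast the time-out on its own, or whether the stall has to come from somewhere else.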

  • We can send the same images between multiple Conquest servers, to KPACS and IQview without any problems.

    Only sending to this one server causes issues.

    I even tried doubling the server memory to 32GB, but it doesn't make a difference. Usage never comes anywhere close to all the memory.


    This is the debug output:


    From this it looks like the decompression might be taking longer than the timeout on the receiving end.

    While the job is running, one of the CPU cores seems to be maxed out and some memory is being used (some being 1-3GB) -> see screenshot attached

    So the first spike in traffic seems to be the first image, which converts really quickly, followed by nothing while it is waiting on image conversion, if I'm reading this correctly. And the receiving end times out while waiting.

    Guessing there is no flag for multi-core compressing and decompressing?

    Also tried disabling read-ahead since I thought it might have been converting multiple things in memory, but that doesn't seem to be the case and did not help.


    I'm guessing my options now are to ask the receiving end to increase their time-outs or to invest in a faster CPU.

  • Hi,


    16 seconds is a really short timeout period. Conquest's default is 300 s. Conquest handles each connection in a single thread, leaving multiple threads to deal with simultaneous connections, so there are few options to accelerate the code. Also, JPEG is quite fast, see graph (for a 512x512 image, write/read time, full scale is 0.12 s). JPEG-LS is slower and JPEG 2000 is a lot slower. NKI compression is faster but not DICOM compliant.



    Marcel

    Marcel van Herk is developer of the Conquest DICOM server together with Lambert Zijp.