DATALOSS Advisories

DATALOSS advisories indicate network transmission problems. These advisories point to situations that defeat the Rendezvous reliable delivery protocols and require investigation.

.PTP indicates that a point-to-point message was lost.

.BCAST indicates that a broadcast message was lost.

DATALOSS.OUTBOUND

Clients of the sending daemon present DATALOSS.OUTBOUND advisories.

Example:

{ADV_CLASS="ERROR" ADV_SOURCE="SYSTEM" ADV_NAME="DATALOSS.OUTBOUND.BCAST" ADV_DESC="dataloss: remote daemon asking for retransmission after we timed out the data" host="xx.xxx.x.xxx" lost=y}

DATALOSS.INBOUND

Clients of the receiving daemon present DATALOSS.INBOUND advisories.

Example:

{ADV_CLASS="ERROR" ADV_SOURCE="SYSTEM" ADV_NAME="DATALOSS.INBOUND.PTP" ADV_DESC="dataloss: remote daemon did not satisfy our retransmission requests" host="xx.x.x.xxx" lost=y}
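When many advisories have been captured as text, it can help to split each one into its fields before reviewing them. The following is a minimal sketch, assuming the advisories are available in the printed form shown in the examples above. The function name parse_advisory and the regular expression are illustrative assumptions, not part of any Rendezvous API or an official grammar.

```python
import re

# KEY="quoted value" or KEY=bare_value, as seen in the printed examples.
# This pattern is an assumption about the text layout only.
_FIELD = re.compile(r'(\w+)=(?:"([^"]*)"|([^\s}]+))')


def parse_advisory(text):
    """Return the fields of one printed advisory as a dict of strings."""
    fields = {}
    for key, quoted, bare in _FIELD.findall(text):
        # Collapse runs of whitespace left over from line wrapping.
        fields[key] = " ".join((quoted or bare).split())
    return fields
```

Calling parse_advisory on one of the example lines returns a dictionary with keys such as ADV_NAME, ADV_DESC, host and lost. All values are kept as strings, so a numeric lost count would need to be converted by the caller.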

FAQs

  • Is the IP address in the advisory message for the sending or receiving daemon?

For DATALOSS.OUTBOUND advisories, the IP address is the host of the receiving daemon; for DATALOSS.INBOUND advisories, it is the host of the sending daemon. In both cases the IP address identifies the 'other' host involved.
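The rule above can be written down as a small lookup, which may be convenient when annotating captured advisories. This is a sketch only; the function name other_host_role is a hypothetical helper, and it relies solely on the ADV_NAME layout shown in the examples (DATALOSS.<direction>.<type>).

```python
def other_host_role(adv_name):
    """Say which daemon the advisory's host field refers to.

    OUTBOUND advisories name the receiving daemon's host;
    INBOUND advisories name the sending daemon's host.
    """
    parts = adv_name.upper().split(".")
    if len(parts) < 2 or parts[0] != "DATALOSS":
        raise ValueError("not a DATALOSS advisory name: %r" % adv_name)
    if parts[1] == "OUTBOUND":
        return "receiving daemon"
    if parts[1] == "INBOUND":
        return "sending daemon"
    raise ValueError("unrecognised direction in %r" % adv_name)
```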

  • How can I find out on which subject data has been lost?

Because DATALOSS means that the packet was never seen (the loss is detected from a sequence number only), it is not possible to determine the subject on which the data was lost.

DATALOSS advisories can report only a packet 'gap' and the associated source IP address, not a subject.

Packets are assembled into messages, or messages are extracted from packets. Once all packets for a message are present, a routine retrieves the subject from the message. If one or more packets are missed and cannot be recovered because of data loss, it is not possible to know which subjects the lost data related to.

Additionally, the sending transport (client) cannot be identified, because the protocol is anonymous by design. If non-anonymous (identified) behaviour is required, certified messaging can be used, which identifies the transport (application endpoint) with a unique identifier.

  • What does the ‘lost’ field in a DATALOSS advisory message represent?

The lost parameter represents the total number of packets lost.

  • My Rendezvous daemon is running with a reliability parameter of 10 seconds instead of the recommended 60 seconds. Could this be related to the DATALOSS advisories I am seeing?

Yes. Decreasing the retention time decreases reliability and increases the probability of lost data.
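To check quickly whether a daemon was started with a shortened retention time, its command line can be inspected for the "-reliability" parameter. The sketch below assumes the command line is already available as a list of arguments; the helper names are hypothetical, and the 60-second threshold is simply the recommended value quoted in this article.

```python
RECOMMENDED_RELIABILITY_SECONDS = 60  # value quoted in this article


def reliability_seconds(argv):
    """Return the value following '-reliability', or None if it is absent."""
    for i, arg in enumerate(argv[:-1]):
        if arg == "-reliability":
            return int(argv[i + 1])
    return None


def below_recommended(argv):
    """True only when '-reliability' is present and lower than 60 seconds."""
    value = reliability_seconds(argv)
    return value is not None and value < RECOMMENDED_RELIABILITY_SECONDS
```

For example, below_recommended(["rvd", "-reliability", "10"]) is true, while a command line without the parameter is reported as not below the recommendation because no explicit value was given.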

Document References

RV Concepts manual, Appendix A: System Advisory Messages: DATALOSS

Troubleshooting

The ADV_DESC field of a DATALOSS advisory message can provide further details about why data loss has occurred.

  • The following description is included in the DATALOSS advisory: ADV_DESC="dataloss: unable to interpret incoming packet host=xxx.xxx.xx.xx" - what does this mean?

This description indicates that data was lost because Rendezvous cannot interpret the incoming packets from the host (xxx.xxx.xx.xx). You could potentially receive this error if a message is not sent from a Rendezvous application.

  • I am seeing one or more of the following descriptions in the ADV_DESC field of the DATALOSS advisory:

ADV_DESC="dataloss: remote daemon already timed out the data"

ADV_DESC="dataloss: remote daemon did not acknowledge our transmission"

ADV_DESC="dataloss: remote daemon asking for retransmission after we timed out the data"

ADV_DESC="dataloss: remote daemon did not satisfy our retransmission request(s)"

These descriptions indicate that data has been lost because the sending daemon no longer retains it.
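When sorting through a large number of advisories, the descriptions discussed in this section can be grouped with a simple lookup. This is a sketch that covers only the ADV_DESC strings quoted in this article; the category labels and the function name classify_adv_desc are illustrative assumptions, and other descriptions may exist.

```python
# Phrases quoted in this article. "request" also matches the
# "requests" and "request(s)" spellings seen in the examples.
RETENTION_PHRASES = (
    "remote daemon already timed out the data",
    "remote daemon did not acknowledge our transmission",
    "remote daemon asking for retransmission after we timed out the data",
    "remote daemon did not satisfy our retransmission request",
)
UNINTERPRETABLE_PHRASE = "unable to interpret incoming packet"


def classify_adv_desc(adv_desc):
    """Map an ADV_DESC string to one of the cases described in this article."""
    text = " ".join(adv_desc.lower().split())
    if UNINTERPRETABLE_PHRASE in text:
        return "uninterpretable-packet"
    if any(phrase in text for phrase in RETENTION_PHRASES):
        return "data-no-longer-retained"
    return "unknown"
```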

  • Potential causes of DATALOSS

There can be several causes for the receipt of DATALOSS advisories:

- A hardware component is experiencing intermittent failure, for example a faulty network card, a loose connection, or a frayed wire.

- The network is saturated.

- A daemon process is starved for CPU cycles; that is, the computer is too heavily loaded or the priority of the daemon process is too low.

- The daemon is running with a "-reliability" parameter lower than 60 seconds.

Information to be sent to TIBCO Support

Please open an SR with TIBCO Support and upload the following:

1. Run "iniftst" (found under the <TIBRV_HOME>/bin directory) on all the affected machines (daemons/clients) and capture the output.

2. Run tibrvlisten on the subject "_RV.>" for approximately 20 minutes.

3. Please send a raw packet capture taken with the rvtrace or tcpdump sniffer tools:

- Using rvtrace (located in the <TIBRV>/bin directory): 'rvtrace -w <capture file name>'. Note: without a license, rvtrace stops capturing packets after 10 minutes.

- Using tcpdump: 'tcpdump -s 250 -w <output_file_name>'

Documentation on rvtrace can be found in Chapter 12 of the RV Administration manual. Please review the sections on "Limitations" and "Performance Effects" before running rvtrace.

4. Run "netstat -s" twice, before and after running rvtrace, and submit the output for review.

5. Monitor and submit details of the memory and CPU usage of the affected hosts.
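For step 4, comparing the two "netstat -s" outputs by eye can be tedious. The following is a rough sketch that lists the counters which changed between two saved outputs. It assumes a Linux-style layout (a section heading such as "Udp:" followed by indented lines of the form "<count> <description>"); other platforms format this output differently, and the function names are hypothetical. The raw outputs should still be submitted to TIBCO Support unchanged.

```python
import re

# Indented "<count> <description>" lines (assumed Linux-style layout).
_COUNTER = re.compile(r'^\s+(\d+)\s+(.*\S)\s*$')


def parse_netstat(text):
    """Return {(section, description): count} for one 'netstat -s' output."""
    counters = {}
    section = ""
    for line in text.splitlines():
        # An unindented line ending in ':' starts a new section.
        if line and not line[0].isspace() and line.rstrip().endswith(":"):
            section = line.strip().rstrip(":")
            continue
        match = _COUNTER.match(line)
        if match:
            counters[(section, match.group(2))] = int(match.group(1))
    return counters


def changed_counters(before_text, after_text):
    """Return the counters whose value differs between the two outputs."""
    before = parse_netstat(before_text)
    after = parse_netstat(after_text)
    return {
        key: after[key] - before.get(key, 0)
        for key in after
        if after[key] != before.get(key, 0)
    }
```

Counters that appear only in the second output are reported with their full value, since the sketch treats a missing earlier value as zero.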