11 Jan 2011

Sync your machine clocks!


When running ActiveMQ with producer and consumers spread across multiple machines, make sure to have the clocks synced on these machines!
Otherwise there might be interesting side effects when using JMS expiration times.

1) Consider the following simple scenario:
A broker runs on host A, where the local time is 1.35 pm.
A producer/consumer pair runs on host B, where the local time is 1.30 pm.

In summary:
Broker time: 1.35 pm
JMS client time: 1.30 pm


The producer sends a message with a time-to-live of 2 mins at 1.30 pm sharp, so the message expires at 1.32 pm. The broker receives the message, checks the expiration time against its own clock (1.35 pm) and concludes that the message has already expired. It therefore moves the message to the DLQ immediately. The consumer will not get the message! This can be particularly surprising if the consumer runs on the same machine as the producer and you start to wonder where your message went.
Note: The JMSExpiration header that is set on the message does not contain the value 2 mins, but the absolute time in the future at which the message expires (represented as a long, in milliseconds). This value is computed using the local time of the message producer and is compared against the local time of the broker before the message is put on the queue.
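
The arithmetic of this scenario can be sketched in a few lines of plain Java. This is only an illustration: the class and method names (ExpirySketch, expirationFor, expiredAt, clock) are made up for this post and are not ActiveMQ or JMS code, and clock readings are modelled as milliseconds since midnight instead of real epoch values.

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: these names are hypothetical, not part of ActiveMQ or JMS.
final class ExpirySketch {

    // The producer stamps an absolute expiration: its own clock plus the time-to-live.
    static long expirationFor(long producerClockMillis, long ttlMillis) {
        return producerClockMillis + ttlMillis;
    }

    // Every other party decides "expired?" by comparing that value with its own clock.
    static boolean expiredAt(long expirationMillis, long localClockMillis) {
        return expirationMillis > 0 && localClockMillis > expirationMillis;
    }

    // Helper: a wall-clock reading as milliseconds since midnight.
    static long clock(int hour, int minute) {
        return TimeUnit.HOURS.toMillis(hour) + TimeUnit.MINUTES.toMillis(minute);
    }

    public static void main(String[] args) {
        long ttl = TimeUnit.MINUTES.toMillis(2);
        // Producer clock reads 1.30 pm, so the stamped expiration is 1.32 pm.
        long expiration = expirationFor(clock(13, 30), ttl);

        // Broker clock reads 1.35 pm: the message looks expired on arrival.
        System.out.println("broker at 1.35 pm sees expired: " + expiredAt(expiration, clock(13, 35)));
        // A broker in sync with the producer would accept it.
        System.out.println("broker at 1.30 pm sees expired: " + expiredAt(expiration, clock(13, 30)));
    }
}
```

The point of the sketch is that the time-to-live itself never travels with the message, only the absolute expiration does, so each machine judges it by its own clock.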

2) Now let's consider the opposite example:
The broker's local time is 1.30 pm and the producer's/consumer's local time is 1.35 pm.

In summary:
Broker time: 1.30 pm
JMS client time: 1.35 pm

The producer again sends a message with a time-to-live of 2 mins. It is received by the broker at 1.30 pm (broker time) with an expiration time of 1.37 pm. The message gets put onto the queue, from where the consumer can grab it. So all is fine in this scenario, although from the broker's point of view the message now lives for 7 mins rather than 2.

It is therefore highly recommended that the clocks of all parties that participate in messaging are more or less synchronized. One simple option is to configure NTP synchronization on each machine. NTP tools are available for all major operating systems.

If for whatever reason you cannot synchronize the clocks between machines (e.g. the broker is running externally), then I suggest using a time-to-live that includes the time difference between the machines. E.g. if the delta is known to be 5 mins, then perhaps set the time-to-live to 5+ mins (adding enough time for message processing and delivery).
On the other hand if message expiration is an important requirement in your application, you should really try to synchronize times between all involved machines.
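
The padding suggested above could look roughly like this. It is a minimal sketch under assumptions: SkewPaddedTtl and paddedTtl are hypothetical names, and the clock difference is assumed to have been measured by some other means. Padding is only needed when the broker clock is ahead of the producer clock, so a negative difference is ignored here.

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: SkewPaddedTtl and its method are hypothetical names.
final class SkewPaddedTtl {

    // Pads the intended time-to-live by the amount the broker clock is ahead
    // of the producer clock. A negative value (broker behind) is clamped to 0.
    static long paddedTtl(long desiredTtlMillis, long brokerAheadMillis) {
        return desiredTtlMillis + Math.max(0L, brokerAheadMillis);
    }

    public static void main(String[] args) {
        long desired = TimeUnit.MINUTES.toMillis(2);
        long skew = TimeUnit.MINUTES.toMillis(5); // assumption: measured out of band

        long ttl = paddedTtl(desired, skew);
        // In a real JMS client this value would be passed to MessageProducer.setTimeToLive(ttl).
        System.out.println("TTL to configure: " + TimeUnit.MILLISECONDS.toMinutes(ttl) + " mins");
    }
}
```

With a delta of 5 mins, an intended time-to-live of 2 mins becomes 7 mins, which leaves the broker with 2 mins of remaining lifetime in scenario 1).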

Part 2


This brings me to the second part of this post, the ActiveMQ TimeStampingBrokerPlugin.

From its documentation:

"This can be useful when the clocks on client machines are known to not be correct and you can only trust the time set on the broker machines."


1) Let's revisit the first scenario again:
Broker time: 1.35 pm
JMS client time: 1.30 pm

The plug-in can help you in this case, because it will not only set the JMS message timestamp to the current time at the broker but also recalculate the resulting JMS expiration time based on the broker's local time. Thus the message will not be marked as expired when it is handled by the broker. It is therefore put onto the queue, from where the consumer can grab it. So the use of the TimeStampingBrokerPlugin can help to resolve the problem of scenario 1).
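
The recalculation can be pictured as follows. This is a simplified sketch of the idea, not the plug-in's source code; BrokerRestampSketch and its methods are hypothetical names, and clock readings are again modelled as milliseconds since midnight.

```java
import java.util.concurrent.TimeUnit;

// Simplified sketch of the idea, not the plug-in's source; all names are hypothetical.
final class BrokerRestampSketch {

    // Keeps the time-to-live the producer intended (expiration - timestamp)
    // but re-bases it on the broker's clock.
    static long rebasedExpiration(long clientTimestamp, long clientExpiration, long brokerNow) {
        long ttl = clientExpiration - clientTimestamp;
        return brokerNow + ttl;
    }

    // Helper: a wall-clock reading as milliseconds since midnight.
    static long clock(int hour, int minute) {
        return TimeUnit.HOURS.toMillis(hour) + TimeUnit.MINUTES.toMillis(minute);
    }

    public static void main(String[] args) {
        long sent = clock(13, 30);      // client timestamp, 1.30 pm
        long expires = clock(13, 32);   // client expiration, 1.32 pm
        long brokerNow = clock(13, 35); // broker clock, 1.35 pm

        long rebased = rebasedExpiration(sent, expires, brokerNow);
        System.out.println("expired before re-stamping: " + (brokerNow > expires));
        System.out.println("expired after re-stamping:  " + (brokerNow > rebased));
    }
}
```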


2) Now let's consider the second scenario again:
Broker time: 1.30 pm
JMS client time: 1.35 pm

The JMS producer sends the message at 1.35 pm local time with an expiration time set to 1.37 pm.
The TimeStampingBrokerPlugin resets the expiration time to 1.32 pm (2 mins based on the broker's local time). The message is put onto the queue.
The connected consumer has a local time of 1.35 pm. It will not grab the message!!
Why? Because from this consumer's point of view the message has already expired.

Such a situation is generally difficult to understand when looking at the system using either the ActiveMQ web console or a JMX console. There is a message on the queue and there is a consumer connected, but the message is not consumed!
You might not immediately think about JMS expiration times and differing machine clocks. You will more likely start to think that the attached consumer is hung or that there is a bug in ActiveMQ.
If you enable ActiveMQ debug logging in the consumer, you will notice that the consumer actually receives the message from the queue but discards it because of its expiration time. Under debug logging the following is printed:


ActiveMQMessageConsumer DEBUG ID:nbwfhtmielke-4668-1294676704879-2:0:1:1 received expired message: MessageDispatch {commandId = 0, … expiration = 1294675812454,
timestamp = 1294674812454, …}
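
The expiration and timestamp fields in this log line are plain milliseconds since the epoch, so they can be made readable with a few lines of Java. This is just a helper sketch for reading such log output; LogFieldDecoder is a hypothetical name.

```java
import java.time.Duration;
import java.time.Instant;

// Sketch for reading the values from the log line above; the class name is hypothetical.
final class LogFieldDecoder {

    // The time-to-live the message was effectively given.
    static Duration ttl(long timestampMillis, long expirationMillis) {
        return Duration.ofMillis(expirationMillis - timestampMillis);
    }

    public static void main(String[] args) {
        long timestamp = 1294674812454L;  // "timestamp" field from the log line
        long expiration = 1294675812454L; // "expiration" field from the log line

        System.out.println("sent (UTC):    " + Instant.ofEpochMilli(timestamp));
        System.out.println("expires (UTC): " + Instant.ofEpochMilli(expiration));
        System.out.println("TTL:           " + ttl(timestamp, expiration).getSeconds() + " s");
        // The consumer discards the message once its own clock is past "expires".
        System.out.println("expired now:   " + Instant.now().isAfter(Instant.ofEpochMilli(expiration)));
    }
}
```

Comparing the decoded expiration with the wall-clock time on the consumer machine quickly shows whether a clock difference is the reason for the discarded message.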



There is a possible solution though: the plug-in configuration property futureOnly="true". If set to true, the plug-in will not set the new expiration time on the message if it is lower than the original expiration time. It will therefore never reset the expiration time to a lower value; instead the original expiration time is preserved. That way the remote consumer will grab the message.
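
The effect of futureOnly on scenario 2) can be sketched as below. Again this is a simplified illustration with hypothetical names (FutureOnlySketch, expirationAfterBroker), not the plug-in's actual implementation, and clock readings are milliseconds since midnight.

```java
import java.util.concurrent.TimeUnit;

// Simplified sketch of the futureOnly guard; hypothetical names, not the plug-in's source.
final class FutureOnlySketch {

    static long expirationAfterBroker(long clientTimestamp, long clientExpiration,
                                      long brokerNow, boolean futureOnly) {
        long recalculated = brokerNow + (clientExpiration - clientTimestamp);
        // With futureOnly, a recalculated value that is not later than the original is dropped.
        if (futureOnly && recalculated <= clientExpiration) {
            return clientExpiration;
        }
        return recalculated;
    }

    // Helper: a wall-clock reading as milliseconds since midnight.
    static long clock(int hour, int minute) {
        return TimeUnit.HOURS.toMillis(hour) + TimeUnit.MINUTES.toMillis(minute);
    }

    public static void main(String[] args) {
        long sent = clock(13, 35);        // client timestamp, 1.35 pm
        long expires = clock(13, 37);     // client expiration, 1.37 pm
        long brokerNow = clock(13, 30);   // broker clock, 1.30 pm
        long consumerNow = clock(13, 35); // consumer clock, 1.35 pm

        long plain = expirationAfterBroker(sent, expires, brokerNow, false);
        long guarded = expirationAfterBroker(sent, expires, brokerNow, true);

        System.out.println("consumer discards without futureOnly: " + (consumerNow > plain));
        System.out.println("consumer discards with futureOnly:    " + (consumerNow > guarded));
    }
}
```

Note that in scenario 1), where the broker is ahead, the recalculated value is later than the original one, so the guard does not get in the way there.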


Note: You could also run into this problem with a camel-jms route that uses the INOUT message exchange pattern and connects to an external broker that has this plug-in configured.
camel-jms uses requestTimeout=20 secs by default. That generates a JMSExpiration message header with an expiration time 20 secs in the future. If the broker's local time is only 30 seconds (or even less; anything beyond the 20 secs timeout is enough) behind the local time of the JMS consumer, the same issue of the message not getting consumed might occur.

Conclusion: If at all possible, sync the clocks on all machines that are involved in the message exchange. If that is not possible, check the time differences and adjust your JMS expiration times accordingly.

8 comments:

VIKRAM said...

Hi,
This post helped me to figure out the cause of a problem which I have been facing for the last 5-6 days. Thanks for such a nice and helpful post. One more query: is there any example of the TimeStampingBrokerPlugin available to understand how to apply this in my functionality?

Thanks in advance.

Vikram,

Lucas Hills said...

Thanks Torsten, nice article. Though even with the timestampplugin, even if you do set futureOnly=true, using your second example, it'd still take your example message 7 minutes to expire and that's only when your 2 clock times are relatively close. What happens when you're trying to run services that are located in another timezone? Say a full day behind and syncing the clocks on both machines is the wrong thing to do?
Wouldn't it be so much cleaner and easier to implement if you could just send along a header in a message that said how long a message should live for based on the time that it arrives at the broker?

Torsten Mielke said...

"Wouldn't it be so much cleaner and easier to implement if you could just send along a header in a message that said how long a message should live for based on the time that it arrives at the broker?"

That may be difficult. The producer creates a msg with an expiry time based on its own time stamp. So if the producer thinks a msg should expire in 3 mins, then it assumes that every broker will honor this time and not decide differently based on its own local time. The outcome could be brokers dropping msgs and you may not realize why.

Lucas Hills said...

Hi Torsten, sorry mate, you've misunderstood what I was trying to say.
I meant, it would be nice if the producer could just send along a long value, say 180000 (not a timestamp based on the producers machine) and then when the broker receives it, it just assigns it an expiry time of brokerTime+180000, and then if no consumers consume it within these 3 minutes, then ActiveMQ just cleans the message off the queue. It makes it very complicated when you have to sync up the times on the computers that are running the producers, consumers and the broker itself.

Sorry I'm just having a rant.. I just wish it was configurable for me to be able to say: ActiveMQ is master and producers and consumers don't care what time is being used.

Maddy said...

Thanks for the article, it helped us to understand the issue that I am facing. But what if the producer time is different from the consumer time? Let's assume that the producer time is 11.35
and the broker/consumer are in sync at 11.30. If a response is expected from the consumer, it would send a response message with timestamp 1.30, but as the producer time is 1.35, does it receive the message? (Currently it is not receiving the message and we are thinking about a solution.) Please could you help us?

Maddy said...

Sorry there is a mistake, 1.30 and 1.35 refer to 11.30 and 11.35. Thanks in advance!

ntsggr said...

Hi,

does the plugin also work in this scenario:

Broker time: 1.30 pm
JMS client producer: 1.30 pm
JMS client consumer: 1.35 pm

TTL: 2min.
-----------------------------

I guess the producer sends the message successfully to the broker and the broker provides the message to the consumer with an expiry time of 1.32
--> The consumer is already at 1.35 and therefore it doesn't grab the message.

Am I wrong, or is it still a problem when only the consumer's clock differs?

Thanks in advance.

Günther Grill

Fred Moore said...

Hi,

in case you are using the JMS API on the consumer side AND the consumer clock is ahead of the broker clock by more than the time-to-live of the message...

...messages will be unduly ignored by the consumer! Until AMQ-5406 is solved, that is.

This is explained in detail here: https://issues.apache.org/jira/browse/AMQ-5406

Possible workarounds: adopt longer time-to-live values to mitigate the issue or adopt a non-JMS ActiveMQ API (e.g. STOMP).