[jira] [Created] (IGNITE-10899) Service Grid: disconnecting during node stop may lead to deadlock

JIRA jira@apache.org
Vyacheslav Daradur created IGNITE-10899:
-------------------------------------------

             Summary: Service Grid: disconnecting during node stop may lead to deadlock
                 Key: IGNITE-10899
                 URL: https://issues.apache.org/jira/browse/IGNITE-10899
             Project: Ignite
          Issue Type: Task
          Components: managed services
    Affects Versions: 2.7
            Reporter: Vyacheslav Daradur
            Assignee: Vyacheslav Daradur
             Fix For: 2.8


In rare cases, when {{onDisconnected}} is called while the node is stopping, a deadlock may occur because {{ServiceDeploymentManager#stopProcessing}} blocks {{busyLock}} and, by design, does not release it.

The issue was found on the TeamCity ZooKeeper suite with the following stack trace:
{CODE}
"disco-notifier-worker-#569118%client4%" #609288 prio=5 os_prio=0 tid=0x00007f905b440800 nid=0x3f6fbd sleeping[0x00007f9383efd000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.ignite.internal.util.GridSpinReadWriteLock.writeLock(GridSpinReadWriteLock.java:204)
at org.apache.ignite.internal.util.GridSpinBusyLock.block(GridSpinBusyLock.java:76)
at org.apache.ignite.internal.processors.service.ServiceDeploymentManager.stopProcessing(ServiceDeploymentManager.java:137)
at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.stopProcessor(IgniteServiceProcessor.java:261)
at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.onDisconnected(IgniteServiceProcessor.java:429)
at org.apache.ignite.internal.IgniteKernal.onDisconnected(IgniteKernal.java:4010)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:819)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:602)
 - locked <0x00000000f7ecdfa0> (a java.lang.Object)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4$$Lambda$25/2087171109.run(Unknown Source)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2696)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2734)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
{CODE}
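
The trace above shows the discovery notifier thread sleeping inside {{GridSpinReadWriteLock.writeLock}}, reached through {{GridSpinBusyLock.block}}. Below is a minimal sketch of how such a call can hang, assuming the lock had already been blocked by another thread during node stop (the trace does not show that thread). It is a JDK-based analog, not Ignite's implementation: {{BusyLockSketch}}, {{tryBlock}} and {{secondBlockSucceeds}} are hypothetical names, and a bounded wait replaces the unbounded spin so that the example terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical JDK-based analog of a busy lock; this is NOT Ignite's GridSpinBusyLock. */
public class BusyLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Tries to block the lock by taking the write lock; the lock is never released afterwards. */
    public boolean tryBlock(long timeoutMs) throws InterruptedException {
        return lock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
    }

    /** Returns true if a second thread managed to block a lock that is already blocked. */
    public static boolean secondBlockSucceeds() {
        BusyLockSketch busyLock = new BusyLockSketch();
        CountDownLatch blocked = new CountDownLatch(1);
        CountDownLatch finish = new CountDownLatch(1);

        // Assumed "node stop" thread: blocks the lock first and keeps holding it.
        Thread stopper = new Thread(() -> {
            try {
                busyLock.tryBlock(1_000);
                blocked.countDown();
                finish.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "node-stop");
        stopper.start();

        try {
            blocked.await();
            // Analog of the disconnect path: a second block attempt from another thread.
            // With an unbounded wait this call would never return.
            boolean acquired = busyLock.tryBlock(200);
            finish.countDown();
            stopper.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second block acquired: " + secondBlockSucceeds());
    }
}
```

In this sketch the second attempt times out and returns false. One possible way to avoid the hang would be to make the stop path idempotent, so that the lock is blocked at most once; this is only a suggestion and not necessarily the fix chosen for this issue.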



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)