[jira] [Created] (IGNITE-13358) Improvements for partition clearing related parts

Anton Vinogradov (Jira)
Alexey Scherbakov created IGNITE-13358:

             Summary: Improvements for partition clearing related parts
                 Key: IGNITE-13358
                 URL: https://issues.apache.org/jira/browse/IGNITE-13358
             Project: Ignite
          Issue Type: Improvement
            Reporter: Alexey Scherbakov
            Assignee: Alexey Scherbakov

We have several issues related to partition clearing that are worth fixing.

1. PartitionsEvictManager doesn't provide clear correctness guarantees when a node or a cache group is stopped while partitions are being cleared concurrently.

2. GridDhtLocalPartition#awaitDestroy is called while holding the topology write lock, which is deadlock-prone, because we currently require the write lock to destroy a partition.

3. GridDhtLocalPartition contains a lot of messy code related to partition clearing, most notably ClearFuture, but the clearing itself is done by PartitionsEvictManager. We should get rid of the clearing code in GridDhtLocalPartition. This should also improve code readability and make it easier to understand what happens during clearing.

4. Currently, moving partitions are cleared before rebalancing in an order different from rebalanceOrder, which breaks the contract.

5. The clearing logic for moving partitions (before rebalancing) seems incorrect: it's possible to lose updates received during clearing.

6. To clear partitions before full rebalancing we use the same threads as for partition eviction. This can slow rebalancing down even if we have spare resources. It would be better to clear partitions in the rebalance pool (explicitly dedicated by the user).

7. It's possible to reserve a renting partition, which makes no sense. All operations on a renting partition (except clearing) are a waste of resources.
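
To illustrate item 4, here is a minimal sketch of planning the pre-rebalance clearing so that it follows rebalanceOrder. It is not Ignite code: the Group class, the clearingPlan method and the "group:partition" task strings are hypothetical stand-ins, and the sketch assumes that groups with a smaller rebalanceOrder should be cleared first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ClearingOrderSketch {
    /** Hypothetical stand-in for a cache group that has moving partitions to clear. */
    static final class Group {
        final String name;
        final int rebalanceOrder;
        final List<Integer> movingParts;

        Group(String name, int rebalanceOrder, List<Integer> movingParts) {
            this.name = name;
            this.rebalanceOrder = rebalanceOrder;
            this.movingParts = movingParts;
        }
    }

    /**
     * Builds the list of clearing tasks ("group:partition") so that groups are
     * visited in ascending rebalanceOrder (an assumption of this sketch).
     */
    static List<String> clearingPlan(List<Group> groups) {
        // Copy before sorting so the caller's list is left untouched.
        List<Group> sorted = new ArrayList<>(groups);

        // List.sort is stable: groups with equal order keep their relative position.
        sorted.sort(Comparator.comparingInt((Group g) -> g.rebalanceOrder));

        List<String> plan = new ArrayList<>();

        for (Group g : sorted)
            for (int p : g.movingParts)
                plan.add(g.name + ":" + p);

        return plan;
    }

    public static void main(String[] args) {
        List<Group> groups = Arrays.asList(
            new Group("late", 2, Arrays.asList(0, 1)),
            new Group("early", 1, Arrays.asList(5)));

        // prints [early:5, late:0, late:1]
        System.out.println(clearingPlan(groups));
    }
}
```

Submitting the tasks in this order to whichever pool performs the clearing (the rebalance pool, if item 6 is adopted) would keep clearing consistent with the order in which groups are later rebalanced.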

This message was sent by Atlassian Jira