Cluster auto activation design proposal

Sergey Chugunov
Hello Ignite developers,

I would like to start a discussion about the design of an important
improvement: enabling automatic activation of a cluster with the durable
store turned on [1]. It will also help us solve an issue with data
divergence (e.g. this may happen when half of the cluster goes down and
updates are applied to the other half, and then the online and offline
parts of the cluster switch).

The idea is to introduce a *BaselineTopology* concept. Simply put, it is
just a collection of nodes that are expected to be in the cluster.

The user establishes a BaselineTopology (BT) on a cluster of the desired
configuration (here I mean the number of nodes in the first place); after
that, this topology is persisted.

Once established, the BT represents a "frozen state" of the topology, which
means that the affinity function uses it instead of the actual topology. As
a result, no rebalancing can happen until the BT is reestablished.

With a BT established, it is easy to implement automatic activation: as the
nodes of a starting cluster join it one by one, a special listener may
trigger cluster activation when the composition of nodes matches the one
described by the BaselineTopology.
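
The listener idea can be illustrated with a minimal Java sketch. The class and method names (AutoActivationListener, onNodeJoined) are hypothetical and not part of any existing Ignite API. The sketch also assumes that "matches" means "every baseline node has joined", so extra nodes outside the BT do not block activation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical listener: activates once every baseline node has joined. */
class AutoActivationListener {
    private final Set<String> baselineIds;                  // consistent IDs recorded in the BT
    private final Set<String> joinedIds = new HashSet<>();  // nodes seen so far
    private boolean activated;

    AutoActivationListener(Set<String> baselineIds) {
        this.baselineIds = new HashSet<>(baselineIds);
    }

    /** Records a join; returns true only for the join that triggers activation. */
    boolean onNodeJoined(String consistentId) {
        joinedIds.add(consistentId);
        // Assumption: activation needs all baseline nodes; extra nodes are ignored.
        if (!activated && joinedIds.containsAll(baselineIds)) {
            activated = true;
            return true;
        }
        return false;
    }

    boolean isActivated() {
        return activated;
    }
}

class AutoActivationDemo {
    public static void main(String[] args) {
        Set<String> baseline = new HashSet<>(Arrays.asList("node-1", "node-2", "node-3"));
        AutoActivationListener listener = new AutoActivationListener(baseline);
        for (String id : Arrays.asList("node-2", "node-1", "node-3")) {
            boolean triggered = listener.onNodeJoined(id);
            System.out.println(id + " joined, activation triggered: " + triggered);
        }
    }
}
```

In this sketch only the join of the last missing baseline node reports a trigger; a real implementation would also have to handle nodes leaving before activation.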

API for BaselineTopology manipulation may look like this:

*Ignite::activation::establishBaselineTopology();*
*Ignite::activation::establishBaselineTopology(BaselineTopology bltTop);*

Both methods will establish the BT and activate the cluster once it is
established.

The first one allows the user to establish the BT using the current
topology. If any changes happen to the topology during the establishing
process, the user will be notified and allowed to proceed or abort the
procedure.

The second method supports monitoring and management tools such as
WebConsole, where the user can prepare a list of nodes, create a BT from
them, and send the cluster a command to finally establish it.

At a high level, the BaselineTopology entity contains only a collection of nodes:

*BaselineTopology {*
*  Collection<TopologyNode> nodes;*
*}*

*TopologyNode* here contains information about a node: its consistent ID and
the set of user attributes used to calculate the affinity function.
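
As a rough illustration, the two entities could be modeled in Java as below. This is only a sketch: the field types (Object for the consistent ID, a String-keyed attribute map) and the contains helper are assumptions, since the proposal fixes only the names BaselineTopology and TopologyNode.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** One expected node: its consistent ID plus the user attributes read by the affinity function. */
final class TopologyNode {
    private final Object consistentId;
    private final Map<String, Object> attributes;

    TopologyNode(Object consistentId, Map<String, Object> attributes) {
        this.consistentId = consistentId;
        // Defensive copy so the persisted description cannot change behind our back.
        this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
    }

    Object consistentId() {
        return consistentId;
    }

    Map<String, Object> attributes() {
        return attributes;
    }
}

/** Immutable collection of the nodes that are expected to be in the cluster. */
final class BaselineTopology {
    private final Collection<TopologyNode> nodes;

    BaselineTopology(Collection<TopologyNode> nodes) {
        this.nodes = Collections.unmodifiableList(new ArrayList<>(nodes));
    }

    Collection<TopologyNode> nodes() {
        return nodes;
    }

    /** Hypothetical helper: true if a node with this consistent ID is part of the baseline. */
    boolean contains(Object consistentId) {
        for (TopologyNode node : nodes) {
            if (node.consistentId().equals(consistentId)) {
                return true;
            }
        }
        return false;
    }
}
```

Both classes are immutable in this sketch, which fits the idea that changing the set of nodes means establishing a new BT rather than mutating the old one.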

To support data divergence prevention, some kind of versioning must be
added to the BT entity so that a joining node can be refused, but we can
clarify this later.

Please provide your feedback/thoughts and ask any questions about the
suggested improvement.

Thanks,
Sergey.

[1] https://issues.apache.org/jira/browse/IGNITE-5851

Re: Cluster auto activation design proposal

Denis Magda-2
Sergey,

Is it assumed that the baseline topology can be updated at runtime? For example, initially I had a cluster of 10 nodes, but a couple of weeks later it was expanded to 15 nodes. How should the baseline topology be updated in this case? Will it happen automatically?


Denis


Re: Cluster auto activation design proposal

Alexey Goncharuk
I think we should be able to change the BT at runtime, and a user
should have several ways to do this:

 * Programmatically, via the API suggested by Sergey
 * Using management tools (console Visor or Web Console)
 * Based on some sort of policies, when the actual cluster topology differs
too much from the baseline or when some critical condition happens (e.g.,
when there are no more backups for a partition)




Re: Cluster auto activation design proposal

dsetrakyan
First of all, this sounds a bit too scientific and will likely be
misunderstood and misused. Is there a chance to give it a better name and a
simpler use case and description?

Secondly, I do agree that it should be mandatory to update the BT. For
example, what if I want to add several nodes to the cluster for scalability
reasons? As Alexey G. suggested, perhaps we can have some policies that
will update the baseline topology automatically and will properly and
clearly report it to the user in logs.

D.


Re: Cluster auto activation design proposal

yzhdanov
In reply to this post by Alexey Goncharuk
> * Based on some sort of policies when the actual cluster topology differs
too much from the baseline or when some critical condition happens (e.g.,
when there are no more backups for a partition)

Good point, Alex! I would go even further. If the cluster is active and
under load, and nodes continue joining and leaving, then we can have several
BTs that it is possible to restart on; the main condition is to have all the
up-to-date data partitions. E.g., if you have 4 servers and 3 backups, most
probably you can have all the data with 2, 3 and, of course, 4 nodes. Does
that make sense?
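
The "main condition" above can be expressed as a small check. The following Java sketch is hypothetical (the class RestartSetCheck and its input map are not an Ignite API) and assumes we already know, for each partition, which nodes hold an up-to-date copy.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class RestartSetCheck {
    /**
     * Returns true if every partition has at least one up-to-date owner among the candidate nodes.
     * Assumption: 'owners' lists only nodes whose copy of the partition is up to date.
     */
    static boolean coversAllPartitions(Map<Integer, Set<String>> owners, Set<String> candidate) {
        for (Set<String> partitionOwners : owners.values()) {
            if (Collections.disjoint(partitionOwners, candidate)) {
                return false; // no candidate node holds this partition
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 4 servers and 3 backups: every node holds a copy of every partition.
        Set<String> all = new HashSet<>(Arrays.asList("n1", "n2", "n3", "n4"));
        Map<Integer, Set<String>> owners = new HashMap<>();
        owners.put(0, all);
        owners.put(1, all);

        Set<String> two = new HashSet<>(Arrays.asList("n1", "n2"));
        System.out.println("Restart on 2 of 4 nodes possible: " + coversAllPartitions(owners, two));
    }
}
```

With fewer backups the same check can fail for some node subsets, which is why the set of topologies that are safe to restart on depends on the actual partition distribution.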

I would also think of a different name. Topology (for me) also implies the
version, but here only the nodes carrying data are important. How about
"restart nodes set"?

--Yakov

Re: Cluster auto activation design proposal

dsetrakyan
How about naming it "minimal node set" or "required node set"?

D.

Re: Cluster auto activation design proposal

yzhdanov
> How about naming it "minimal node set" or "required node set"?

Required for what? I would add "restart" if there is no confusion.

--Yakov

Re: Cluster auto activation design proposal

Sergey Chugunov
From my standpoint, the name for the concept should emphasize that the nodes
from the set constitute a target topology - the place where the user wants
to be.

If we go the "node set" way, what about FixedNodeSet or BaseNodeSet?

"restart node set" is also a bit confusing, because this concept applies not
only to restarts but also to adding nodes to and removing nodes from the
cluster.

E.g. a cluster admin decides to add ten more nodes to an existing cluster:
he/she starts them one by one; the nodes join the cluster but don't receive
any data, as they are not in the FixedNodeSet yet.
Then the admin issues a "change fixed node set" command, or adds them to the
set in some other way, and the nodes become operational.
As one can see, no restarts are involved in the process.

Thanks,
Sergey.


Re: Cluster auto activation design proposal

dsetrakyan
In reply to this post by yzhdanov
Yakov,

I think it is not just restarts, this set of nodes is minimally required for the cluster to function, no?

D.

Re: Cluster auto activation design proposal

Sergey Chugunov
Dmitriy,

The obvious connotation of "minimal set" is a set that cannot be decreased.

But let's consider the following case: a user has a cluster of 50 nodes and
decides to switch off 3 nodes for maintenance for a while. OK, the user just
does it and then recreates this "minimal node set" with only 47 nodes.

So the initial minimal node set was decreased - something that is
counter-intuitive to me and may cause confusion as well.



Re: Cluster auto activation design proposal

Dmitry Pavlov
Igniters, what about Target Node Set? Complete Node Set?

When we reach this topology, we can activate the cluster.


Re: Cluster auto activation design proposal

yzhdanov
In reply to this post by dsetrakyan
>I think it is not just restarts, this set of nodes is minimally required
for the cluster to function, no?

I don't think so. Cluster can function if there is no data loss.

--Yakov

Re: Cluster auto activation design proposal

yzhdanov
In reply to this post by Sergey Chugunov
>Obvious connotation of "minimal set" is a set that cannot be decreased.

>But lets consider the following case: user has a cluster of 50 nodes and
>decides to switch off 3 nodes for maintenance for a while. Ok, user just
>does it and then recreates this "minimal node set" to only 47 nodes.

>So initial minimal node set was decreased - something counter-intuitive to
>me and may cause confusion as well.

That was my point. If I have 50 nodes and 3 backups, I can restart on 48, 49
or 50 nodes without data loss. In the case of 48 or 49, after the cluster
gets activated, the missing backups are assigned and rebalancing starts.

--Yakov

Re: Cluster auto activation design proposal

Alexey Goncharuk
My understanding of Baseline Topology is that it is the set of nodes which
are *expected* to be in the cluster.
Let me go a little bit further, because BT (or whatever name we choose) may
and will solve more issues than just auto-activation:

1) More graceful control over rebalancing than just a rebalance delay. If a
server is shut down for maintenance and there are enough backup nodes in
the cluster, there is no need to rebalance.
2) A guarantee that there will be no conflicting key-value mappings due to
incorrect cluster activation. For example, consider a scenario where there
was a cluster of 10 nodes; the cluster was shut down, the first 5 nodes were
started and activated, some updates were made, those 5 nodes were shut down,
the other 5 nodes were started and activated, some more updates were made,
and then the first 5 nodes were started again. Currently, there is no way to
determine that there was an incompatible topology change which leads to
data inconsistency.
3) When a cluster is shutting down node by node, we must track the node that
'saw' a partition last and not activate the cluster until all nodes are
present. Otherwise, again, we may activate too early and see outdated
values.

I do not want to add any 'faster' hacks here because they will only make
the issues above more likely to appear. Besides, BT should be available in
2.2 anyway, so there is no need to rush with hacks.

--AG


Re: Cluster auto activation design proposal

Sergey Chugunov
I would also like to provide more use cases of how BLT is supposed to work
(let me call it this way until we come up with a better name):

   1. The user creates a new BLT using WebConsole or another tool and
   "applies" it to a brand-new cluster.

   2. The user starts up a brand-new cluster with the desired number of nodes
   and activates it. At the moment of activation, a BLT is created from all
   server non-daemon nodes present in the cluster.

   3. The user starts up a cluster with a previously prepared BLT -> when the
   set of nodes in the cluster matches the BLT, the cluster is activated
   automatically.

   4. The user has an up-and-running active cluster and starts a few more
   nodes. They join the cluster, but no partitions are assigned to them.
   The user recreates the BLT on the new cluster topology -> partitions are
   assigned to the new nodes.

   5. The user takes nodes out of the cluster (e.g. for maintenance purposes):
   no rebalancing happens until the user recreates the BLT on the new cluster
   topology.

   6. If some parameters reach critical levels (e.g. the number of backups for
   a partition is too low), the coordinator automatically recreates the BLT
   and thus triggers rebalancing.
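
Use case 4 can be illustrated with a toy Java sketch in which partitions are assigned from the baseline node list rather than from the nodes that are actually online. The round-robin assignment and the class name BaselineAffinitySketch are illustrative assumptions only; this is not how Ignite's real affinity function distributes partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BaselineAffinitySketch {
    /** Assigns each partition to one baseline node, round-robin over the sorted node IDs. */
    static Map<Integer, String> assign(List<String> baseline, int partitions) {
        if (baseline.isEmpty()) {
            throw new IllegalArgumentException("baseline must contain at least one node");
        }
        List<String> sorted = new ArrayList<>(baseline);
        Collections.sort(sorted); // deterministic order, independent of join order
        Map<Integer, String> assignment = new HashMap<>();
        for (int p = 0; p < partitions; p++) {
            assignment.put(p, sorted.get(p % sorted.size()));
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Node "C" has joined the cluster but is not part of the BLT yet.
        Map<Integer, String> before = assign(Arrays.asList("A", "B"), 6);
        System.out.println("C owns partitions before BLT change: " + before.containsValue("C"));

        // The user recreates the BLT on the new topology; C now receives partitions.
        Map<Integer, String> after = assign(Arrays.asList("A", "B", "C"), 6);
        System.out.println("C owns partitions after BLT change: " + after.containsValue("C"));
    }
}
```

Because the assignment depends only on the baseline list, joining or leaving nodes change nothing until the BLT itself is recreated, which is the behaviour described in use cases 4 and 5.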


I hope these use cases will help to clarify purposes of the proposed
feature.


Re: Cluster auto activation design proposal

Alexey Kuznetsov
Hi,

>>1. User creates new BLT using WebConsole or other tool and "applies" it
 to brand-new cluster.

Good idea, but we should also implement a *command-line utility* for the
same use case.

--
Alexey Kuznetsov

Re: Cluster auto activation design proposal

Sergey Chugunov
Folks,

I've summarized all the results of our discussion so far on a wiki page:
https://cwiki.apache.org/confluence/display/IGNITE/Automatic+activation+design+-+draft

I hope I have reflected the most important details, and I am going to add
API suggestions for all use cases soon.

Feel free to give feedback here or in the comments under the page.

Thanks,
Sergey.


Re: Cluster auto activation design proposal

Denis Magda-2
Sergey,

That’s the only concern I have:

* 5. User takes out nodes from cluster (e.g. for maintenance purposes): no
  rebalance happens until user recreates BLT on new cluster topology.*

What if a node crashes (or there is some other kind of outage) in the middle of the night and the user has to be sure that the surviving nodes will rearrange and rebalance partitions?


Denis



Re: Cluster auto activation design proposal

Alexey Goncharuk
Denis,

This should be handled by the BT triggers. If I have 3 backups configured,
I actually won't care if my cluster lives for 6 hours without an additional
backup. If there is only one backup left for a partition, a new BT should
be triggered automatically.
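
Such a trigger could look roughly like the Java sketch below. The class BaselineTriggerPolicy, the criticalBackups threshold and the liveCopies input are hypothetical names chosen for illustration; how the cluster would actually collect the per-partition counts is left open.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical policy: request a new BT when any partition is down to too few backups. */
class BaselineTriggerPolicy {
    private final int criticalBackups; // e.g. 1 means "only one backup left"

    BaselineTriggerPolicy(int criticalBackups) {
        this.criticalBackups = criticalBackups;
    }

    /**
     * @param liveCopies partition ID -> number of online copies (primary plus backups).
     * @return true if at least one partition has criticalBackups or fewer backups left.
     */
    boolean shouldRecreateBaseline(Map<Integer, Integer> liveCopies) {
        for (int copies : liveCopies.values()) {
            int backupsLeft = copies - 1; // one of the copies is the primary
            if (backupsLeft <= criticalBackups) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BaselineTriggerPolicy policy = new BaselineTriggerPolicy(1);

        // 3 backups configured (4 copies); one node is down, so 2 backups remain.
        Map<Integer, Integer> oneNodeDown = new HashMap<>();
        oneNodeDown.put(0, 3);
        System.out.println("Trigger with 2 backups left: " + policy.shouldRecreateBaseline(oneNodeDown));

        // A second node is down: only one backup remains for partition 0.
        Map<Integer, Integer> twoNodesDown = new HashMap<>();
        twoNodesDown.put(0, 2);
        System.out.println("Trigger with 1 backup left: " + policy.shouldRecreateBaseline(twoNodesDown));
    }
}
```

With a threshold of 1, losing a single node out of a 3-backup configuration does not trigger anything, which matches the "6 hours without an additional backup" tolerance described above.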


Re: Cluster auto activation design proposal

dsetrakyan
Can we brainstorm on the names again? I am not sure we have a consensus on
the name "baseline topology". This will be included in the Ignite
configuration, so the name has to be clear.

Some of the proposals were:

- baseline topology
- minimal node set
- node restart set
- minimal topology

Any other suggestions?

D.
