[DISCUSSION] Cache warmup


ткаленко кирилл
Currently, after a node restart, all caches are cold: the first requests to them gradually load data from disk, which can slow down the first calls.
If the node has more RAM than there is data on disk, the data can be loaded at startup (a "warmup"), which avoids the slowdown on the first calls to the caches.

I suggest adding a warmup phase after recovery here [1] after [2], before discovery.

I suggest adding a new interface:

package org.apache.ignite.internal.processors.cache;

import org.apache.ignite.IgniteCheckedException;
import org.apache.ignite.internal.IgniteInternalFuture;
import org.jetbrains.annotations.Nullable;

/**
 * Interface for warming up cache.
 */
public interface CacheWarmup {
    /**
     * Warmup cache.
     *
     * @param cacheCtx Cache context.
     * @return Future cache warmup.
     * @throws IgniteCheckedException if failed.
     */
    @Nullable IgniteInternalFuture<?> process(GridCacheContext cacheCtx) throws IgniteCheckedException;
}

This allows caches to be warmed up in parallel and asynchronously. The warmup phase ends once the IgniteInternalFuture of every cache is done.
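To illustrate the "parallel and asynchronous" part, here is a minimal sketch. It uses java.util.concurrent.CompletableFuture as a stand-in for IgniteInternalFuture and a hypothetical Warmup interface in place of the proposed CacheWarmup; neither the stand-in interface nor warmUpAll is Ignite API. Treating a null future as "nothing to warm up" is also an assumption, based only on the @Nullable annotation above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ParallelWarmupSketch {
    /** Hypothetical stand-in for the proposed CacheWarmup interface. */
    interface Warmup {
        /** @return Future that completes when the cache is warm, or null (assumed: nothing to do). */
        CompletableFuture<?> process(String cacheName);
    }

    /** Starts the warmup of every cache, then blocks until all returned futures are done. */
    static void warmUpAll(List<String> cacheNames, Warmup warmup) {
        List<CompletableFuture<?>> futs = new ArrayList<>();

        for (String name : cacheNames) {
            CompletableFuture<?> fut = warmup.process(name);

            // Assumption: a null future means this cache needs no warmup.
            if (fut != null)
                futs.add(fut);
        }

        // The warmup phase ends only when every per-cache future has completed.
        CompletableFuture.allOf(futs.toArray(new CompletableFuture<?>[0])).join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ConcurrentLinkedQueue<String> warmed = new ConcurrentLinkedQueue<>();

        try {
            // Each cache is "warmed" on the pool; here that just records the cache name.
            warmUpAll(Arrays.asList("orders", "users"),
                name -> CompletableFuture.runAsync(() -> warmed.add(name), pool));
        }
        finally {
            pool.shutdown();
        }

        System.out.println("Warmed caches: " + warmed.size());
    }
}
```

Note that the calling thread still blocks in warmUpAll, so node start would wait for the slowest cache.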

The warmup implementation can also be configured via the methods:
org.apache.ignite.configuration.IgniteConfiguration#setDefaultCacheWarmup
org.apache.ignite.configuration.CacheConfiguration#setCacheWarmup

This makes it possible to set a cache warmup implementation either for a specific cache or, if necessary, for all caches.
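A minimal sketch of how the two settings could interact, assuming that a cache-specific warmup takes precedence over the default one (the proposal does not spell this out). The class below is a hypothetical stand-in for both configuration classes, not Ignite code; the two setter names mirror the proposed methods.

```java
import java.util.HashMap;
import java.util.Map;

class WarmupConfigSketch {
    /** Hypothetical stand-in for a warmup implementation; only its name matters here. */
    interface Warmup {
        String name();
    }

    /** Would be set via the proposed IgniteConfiguration#setDefaultCacheWarmup. */
    private Warmup dfltWarmup;

    /** Would be set via the proposed CacheConfiguration#setCacheWarmup, keyed by cache name. */
    private final Map<String, Warmup> perCache = new HashMap<>();

    void setDefaultCacheWarmup(Warmup warmup) {
        dfltWarmup = warmup;
    }

    void setCacheWarmup(String cacheName, Warmup warmup) {
        perCache.put(cacheName, warmup);
    }

    /** Assumed rule: a cache-specific warmup wins, otherwise the default (possibly null) is used. */
    Warmup resolve(String cacheName) {
        return perCache.getOrDefault(cacheName, dfltWarmup);
    }

    public static void main(String[] args) {
        WarmupConfigSketch cfg = new WarmupConfigSketch();

        cfg.setDefaultCacheWarmup(() -> "sequential");
        cfg.setCacheWarmup("orders", () -> "custom");

        System.out.println("orders -> " + cfg.resolve("orders").name());
        System.out.println("users  -> " + cfg.resolve("users").name());
    }
}
```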

I suggest adding an implementation, SequentialWarmup, that uses [3].

Questions, suggestions, comments?

[1] - org.apache.ignite.internal.processors.cache.GridCacheProcessor.CacheRecoveryLifecycle#afterLogicalUpdatesApplied
[2] - org.apache.ignite.internal.processors.cache.GridCacheProcessor.CacheRecoveryLifecycle#restorePartitionStates
[3] - org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManager.CacheDataStore#preload

Re: [DISCUSSION] Cache warmup

slava.koptilin
Hello Kirill,

Thanks a lot for driving this activity. If I am not mistaken, this
discussion relates to IEP-40.

> I suggest adding a warmup phase after recovery here [1] after [2], before discovery.
This means that the user's thread, which starts Ignite via
Ignition.start(), will wait for an additional step: cache warm-up.
I think this has to be clearly mentioned in our documentation (in the
Javadoc at least) because this step can be time-consuming.

> I suggest adding a new interface:
I would change it a bit. First of all, it would be nice to move this
interface to a public package and stop using GridCacheContext,
which is an internal class and should not leak into the public API in any
case.
Perhaps this parameter is not needed at all, or we should add a public
abstraction in place of the internal class.

package org.apache.ignite.configuration;

import org.apache.ignite.IgniteCheckedException;
import org.apache.ignite.lang.IgniteFuture;

public interface CacheWarmupper {
    /**
     * Warmup cache.
     *
     * @param cachename Cache name.
     * @return Future cache warmup.
     * @throws IgniteCheckedException If failed.
     */
    IgniteFuture<?> warmup(String cachename) throws IgniteCheckedException;
}

Thanks,
S.


Re: [DISCUSSION] Cache warmup

Zhenya Stanilovsky

It looks like we need an additional method for static caches, for example warmup(List<CacheConfiguration> cconf). It would be helpful for Spring too.
 


Re: [DISCUSSION] Cache warmup

Denis Mekhanikov
Kirill,

That will be a great feature! Other popular databases already have it (e.g.
Postgres: https://www.postgresql.org/docs/11/pgprewarm.html), so it's good
that we're also going to have it in Ignite.

What implementation of CacheWarmup interface do you have in mind? Will
there be some preconfigured implementation, and will users be able to
implement it themselves?

Do you think it should be cache-based? I would say that a DataRegion-based
warm-up would come more naturally. Page IDs that are loaded into the data
region can be dumped periodically to disk and recovered on restarts. This
is more or less how it works in Postgres.
I'm afraid that if we make it cache-based, the implementation won't be that
obvious. We already have an API for warmup that appeared to be pretty much
impossible to apply in a useful way:
https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/IgniteCache.html#preloadPartition-int-
Let's make sure that our new tool for warming up is actually useful.
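The data-region idea above (dump the IDs of loaded pages periodically, read them back on restart) can be sketched as follows. This is only an illustration under stated assumptions: the one-ID-per-line text format and the names PageIdDumpSketch, dump and restore are hypothetical, and no Ignite page memory API is used.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PageIdDumpSketch {
    /** Writes the IDs of the pages currently held in memory, one ID per line. */
    static void dump(Path file, List<Long> loadedPageIds) throws IOException {
        List<String> lines = loadedPageIds.stream()
            .map(String::valueOf)
            .collect(Collectors.toList());

        Files.write(file, lines);
    }

    /** Reads the IDs back after a restart; a missing dump means there is nothing to prewarm. */
    static List<Long> restore(Path file) throws IOException {
        if (!Files.exists(file))
            return new ArrayList<>();

        return Files.readAllLines(file).stream()
            .map(Long::valueOf)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("page-ids", ".dump");

        try {
            // Before shutdown (or periodically): remember which pages were in memory.
            dump(file, Arrays.asList(42L, 7L, 1001L));

            // After restart: these are the pages a warm-up would load first.
            System.out.println("Pages to prewarm: " + restore(file));
        }
        finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A real implementation would also have to handle pages that no longer exist after the restart and a dump that is only partially written.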

Denis


Re: [DISCUSSION] Cache warmup

ткаленко кирилл
In reply to this post by slava.koptilin
Hi, Slava!
 
Thank you for looking at the proposal and making fair comments.

I discussed this personally with Anton and Alexey, since they are the author and sponsor of "IEP-40", and we concluded that point 2 of the IEP is no longer relevant and can be removed.
I suggest implementing point 3, since it can be done independently of point 1. Also, the warm-up will always start after the restore phase, without subscribing to events.
 
You are right, this should be mentioned in the documentation and the Javadoc:
> This means that the user's thread, which starts Ignite via
> Ignition.start(), will wait for an additional step - cache warm-up.
> I think this fact has to be clearly mentioned in our documentation (at
> Javadoc at least) because this step can be time-consuming.
 
My suggestion for the implementation:
1) Add a marker interface "org.apache.ignite.configuration.WarmUpConfiguration" for configuring cache warm-up;
2) Allow only one configuration to be set, via "org.apache.ignite.configuration.IgniteConfiguration#setWarmUpConfiguration";
3) Add an internal warm-up interface that will be invoked in [1] after [2];
 
package org.apache.ignite.internal.processors.cache.warmup;
 
import org.apache.ignite.IgniteCheckedException;
import org.apache.ignite.configuration.WarmUpConfiguration;
import org.apache.ignite.internal.GridKernalContext;
 
/**
 * Interface for warming up.
 */
public interface WarmUpStrategy<T extends WarmUpConfiguration> {
    /**
     * Returns configuration class for mapping to strategy.
     *
     * @return Configuration class.
     */
    Class<T> configClass();
 
    /**
     * Warm up.
     *
     * @param kernalCtx Kernal context.
     * @param cfg       Warm-up configuration.
     * @throws IgniteCheckedException If failed.
     */
    void warmUp(GridKernalContext kernalCtx, T cfg) throws IgniteCheckedException;
}
 
4) Add an internal plugin extension for supplying custom strategies;
 
package org.apache.ignite.internal.processors.cache.warmup;
 
import java.util.Collection;
import org.apache.ignite.plugin.Extension;
 
/**
 * Interface for getting warm-up strategies from plugins.
 */
public interface WarmUpStrategySupplier extends Extension {
    /**
     * Getting warm-up strategies.
     *
     * @return Warm-up strategies.
     */
    Collection<WarmUpStrategy> strategies();
}
 
5) Add a "Load all" strategy that loads everything into memory as long as there is free space. This strategy is suitable when the persistent storage is smaller than RAM.
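The way configClass() maps a configuration to its strategy in the list above can be sketched as a small registry. This is a simplified, self-contained illustration: the interfaces are stand-ins for the proposed ones (the GridKernalContext parameter and the checked exception are dropped), and LoadAllConfiguration, LoadAllStrategy and the registry class itself are hypothetical names.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class WarmUpRegistrySketch {
    /** Stand-in for the proposed marker interface WarmUpConfiguration. */
    interface WarmUpConfiguration { }

    /** Simplified stand-in for the proposed WarmUpStrategy (no kernal context). */
    interface WarmUpStrategy<T extends WarmUpConfiguration> {
        Class<T> configClass();

        void warmUp(T cfg);
    }

    /** Hypothetical configuration of the "Load all" strategy. */
    static class LoadAllConfiguration implements WarmUpConfiguration { }

    /** Hypothetical strategy; it only records that it was started. */
    static class LoadAllStrategy implements WarmUpStrategy<LoadAllConfiguration> {
        boolean started;

        @Override public Class<LoadAllConfiguration> configClass() {
            return LoadAllConfiguration.class;
        }

        @Override public void warmUp(LoadAllConfiguration cfg) {
            started = true;
        }
    }

    /** Strategies keyed by the configuration class they declare via configClass(). */
    private final Map<Class<?>, WarmUpStrategy<?>> strategies = new HashMap<>();

    /** Registers built-in strategies and those supplied by plugins. */
    WarmUpRegistrySketch(Collection<? extends WarmUpStrategy<?>> available) {
        for (WarmUpStrategy<?> strategy : available)
            strategies.put(strategy.configClass(), strategy);
    }

    /** Finds the strategy registered for the configuration's class and runs it. */
    @SuppressWarnings("unchecked")
    <T extends WarmUpConfiguration> void warmUp(T cfg) {
        WarmUpStrategy<T> strategy = (WarmUpStrategy<T>)strategies.get(cfg.getClass());

        if (strategy == null)
            throw new IllegalArgumentException("No warm-up strategy for " + cfg.getClass().getName());

        strategy.warmUp(cfg);
    }

    public static void main(String[] args) {
        LoadAllStrategy loadAll = new LoadAllStrategy();
        WarmUpRegistrySketch registry = new WarmUpRegistrySketch(Collections.singletonList(loadAll));

        registry.warmUp(new LoadAllConfiguration());

        System.out.println("Load-all strategy started: " + loadAll.started);
    }
}
```

The lookup is by exact configuration class, so a subclass of a configuration would not find its parent's strategy in this sketch.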
 
Any objections or comments?
 
[1] - org.apache.ignite.internal.processors.cache.GridCacheProcessor.CacheRecoveryLifecycle#afterLogicalUpdatesApplied
[2] - org.apache.ignite.internal.processors.cache.GridCacheProcessor.CacheRecoveryLifecycle#restorePartitionStates


Re: [DISCUSSION] Cache warmup

ткаленко кирилл
In reply to this post by Zhenya Stanilovsky
Hi Eugene!
This will be taken into account in specific strategies.


Re: [DISCUSSION] Cache warmup

ткаленко кирилл
In reply to this post by Denis Mekhanikov
Hi, Denis!

In my earlier reply to Slava I described the implementation I have in mind. It will now be possible to add custom warm-up strategy implementations, and they can be implemented in different ways.

At the moment, I suggest implementing one "Load all" strategy, which is effective when the persistent storage is smaller than RAM.
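A minimal sketch of the "Load all" behaviour, assuming that partitions are taken in a fixed order and that memory is counted in pages. The class and method names are hypothetical and only the planning arithmetic is shown, not the actual page loading.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LoadAllSketch {
    /**
     * Plans a "Load all" pass: partitions are taken in order and each one is loaded
     * as far as the remaining free space allows.
     *
     * @param pagesPerPartition Number of pages each partition has on disk.
     * @param freePages Number of pages that still fit into memory.
     * @return Number of pages to load for each partition, in the same order.
     */
    static List<Long> plan(List<Long> pagesPerPartition, long freePages) {
        List<Long> toLoad = new ArrayList<>();
        long free = Math.max(0, freePages);

        for (long pages : pagesPerPartition) {
            long n = Math.min(pages, free); // Load as much as still fits.

            toLoad.add(n);
            free -= n;
        }

        return toLoad;
    }

    public static void main(String[] args) {
        // Storage (350 pages) is larger than memory (120 pages): only part of it is loaded.
        System.out.println(plan(Arrays.asList(100L, 50L, 200L), 120));

        // Storage is smaller than memory: everything is loaded.
        System.out.println(plan(Arrays.asList(100L, 50L, 200L), 1000));
    }
}
```

When the storage does not fit, which pages end up in memory depends only on the iteration order, so the strategy is mainly useful in the case where everything fits.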



Re: [DISCUSSION] Cache warmup

Denis Mekhanikov
Kirill,

When I discussed this functionality with Ignite users, I heard the
following thoughts about warming up:

   - Node restarts affect performance of queries. The main reason for that
   is that the pages that were loaded into memory before the restart are on
   disk after the restart. It takes time to reach the same distribution of
   data between memory and disk. Until that point the performance is usually
   degraded. No simple rule like "load everything" helps here if only a part
   of data fits in memory.
   - It would be nice to have a way to give preferences to indices when
   doing a warmup. Usually indices are used more often than data nodes, so
   loading indices first would bring more benefits.

The first point can be addressed by implementing the policy that would
restore the memory state that was observed before the restart. I don't see
how it can be implemented using the suggested interface.
The second one requires direct work with data pages, but not with a cache
context, so it's also impossible to implement.
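To make the "indices first" preference concrete, here is a minimal sketch of a page-level ordering policy. It is not tied to the suggested interface or to any Ignite API: Page, Kind and loadOrder are hypothetical names, and the memory budget is simply a page count.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class IndexFirstWarmupSketch {
    /** Page kinds; INDEX is declared first so that it sorts before DATA. */
    enum Kind { INDEX, DATA }

    /** Hypothetical page descriptor: an ID plus what the page stores. */
    static class Page {
        final long id;
        final Kind kind;

        Page(long id, Kind kind) {
            this.id = id;
            this.kind = kind;
        }
    }

    /**
     * Chooses which pages to load: all index pages first, then data pages,
     * cut off at the number of pages that fit into memory.
     */
    static List<Long> loadOrder(List<Page> pages, int budget) {
        return pages.stream()
            .sorted(Comparator.comparing((Page p) -> p.kind)) // Stable sort: original order kept within a kind.
            .limit(budget)
            .map(p -> p.id)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Page> pages = Arrays.asList(
            new Page(1, Kind.DATA),
            new Page(2, Kind.INDEX),
            new Page(3, Kind.DATA),
            new Page(4, Kind.INDEX));

        // Only three pages fit: both index pages are chosen before any data page.
        System.out.println(loadOrder(pages, 3));
    }
}
```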

When loading of all cache data is required, it can be done by running a
local scan query. It will iterate through all data pages and result in
their allocation in memory.

So, I don't really see a scenario where the suggested API would help. Do you
have a suitable use case that it would cover?

Denis


Re: [DISCUSSION] Cache warmup

ткаленко кирилл
Hi, Denis!

For now, I suggest a simple warm-up implementation for the case where the persistent storage is smaller than RAM. If others want additional implementations, they can add them by implementing the interfaces. For the first point, we need to figure out how and where we will remember pages, etc. Such tasks may require improvements in the kernel.

In the "WarmUpStrategy#warmUp" method, we get "GridKernalContext#cache", from which we can reach caches and cache groups through "GridCacheProcessor#cacheGroups", "GridCacheProcessor#caches" and so on, and through them we can access pages.
> The second one requires direct work with data pages, but not with a cache
> context, so it's also impossible to implement.

This requires writing additional custom code, which may also run longer because of its SQL features, and so on.
It would be more convenient for both developers and grid administrators to just set a warm-up strategy.
> When loading of all cache data is required, it can be done by running a
> local scan query. It will iterate through all data pages and result in
> their allocation in memory.


Re: [DISCUSSION] Cache warmup

Stanislav Lukyanov
Kirill,

Thanks for driving this. Many users have been waiting for it.

A few comments and questions.


I would keep CacheWarmup interface purely internal and never view it as an interface which a user would be implementing.
There are multiple reasons for that:
- The logic of the cache warmup is very low-level; how is a user supposed to know which pages they want?
- A sophisticated strategy will require accessing private APIs for sure; say, I need a strategy which loads the last known memory state before the restart; how can I even implement that without breaking into various internals?
- In fact there aren't many implementations which make sense ("load everything", "load indexes", "load last memory state", "load N GB at random"); every use case I've seen would be solved by a "load everything" strategy (if disk is < RAM) or "load last memory state" strategy
- Warmup will be a critical phase, and a custom user implementation is all too likely to cause issues. We should avoid executing user code in critical stages if we can help it.
To summarize, if we put warmup strategies in users' hands, they will be hard to write, will require breaking into internals or a lot of additional public interfaces for these internals, will likely cause issues with the cluster, and everyone will be implementing the same few general strategies.
Basically, I expect only fellow Ignite developers to be implementing their own strategies.
Because of that, I propose to keep the interfaces private and expose only a single public parameter. The parameter can take an enum of the supported strategies. New useful strategies should be added to the Ignite codebase.
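
As a rough illustration of that proposal, the sketch below exposes only an enum to the user and keeps the strategy contract internal. All names here (WarmUpMode, InternalWarmUp, WarmUpRegistry) are hypothetical and exist only for this example; the strategies return a description string instead of touching any pages.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical public parameter: the only thing a user would configure. */
enum WarmUpMode { NONE, LOAD_ALL, LOAD_INDEXES }

/** Hypothetical internal contract; implementations would live in the codebase only. */
interface InternalWarmUp {
    /** Returns a description of what would be loaded for the given data region. */
    String warmUp(String regionName);
}

/** Maps each public enum value to its internal implementation. */
final class WarmUpRegistry {
    private static final Map<WarmUpMode, InternalWarmUp> STRATEGIES = new EnumMap<>(WarmUpMode.class);

    static {
        STRATEGIES.put(WarmUpMode.NONE, region -> "skip " + region);
        STRATEGIES.put(WarmUpMode.LOAD_ALL, region -> "load all pages of " + region);
        STRATEGIES.put(WarmUpMode.LOAD_INDEXES, region -> "load index pages of " + region);
    }

    /** Looks up the internal strategy for a user-selected mode. */
    static InternalWarmUp forMode(WarmUpMode mode) {
        return STRATEGIES.get(mode);
    }
}
```

With this shape, adding a new strategy means adding an enum constant and an internal implementation, and no user code runs during the warm-up phase.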


Will there be a way to interrupt warmup phase and continue startup (e.g. via JMX, REST and/or control.sh)? Can we have it please?
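
One possible shape for this, as a sketch only: the warm-up loop checks a stop flag between pages, and a management endpoint flips that flag. The class InterruptibleWarmUp is hypothetical, and loading a page is simulated by incrementing a counter.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical warm-up loop that can be stopped from another thread. */
final class InterruptibleWarmUp {
    private final AtomicBoolean stopRequested = new AtomicBoolean();

    /** Would be invoked by a management endpoint (for example JMX, REST or control.sh). */
    void stop() {
        stopRequested.set(true);
    }

    /**
     * Loads pages one by one until all are loaded or a stop is requested.
     * Returns the number of pages loaded.
     */
    int run(int totalPages, Runnable afterEachPage) {
        int loaded = 0;

        while (loaded < totalPages && !stopRequested.get()) {
            loaded++; // stand-in for reading one page from disk
            afterEachPage.run();
        }

        return loaded;
    }
}
```

Startup would then continue with whatever part of the data has been loaded so far.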


I think that ideally warmup should be configured per-cache - I believe this is what a user would expect to do.
However, cache configs are immutable. We need a way for existing users to enjoy the cache warmup feature, as well as for early adopters to switch to more sophisticated strategies as they will be released (or as their dataset grows).
Because of that I propose to add the cache warmup configuration to the DataRegionConfiguration. Data regions can be changed between restarts, independently on each node allowing for a rolling change.


Will the preloadPartition() method be deprecated together with this change? I assume yes?


How hard would it be to implement a "load all indexes, metapages and freelists" strategy in addition to "load everything"?
I think it would be an MVP for environments with datasets larger than RAM. A "load everything" strategy will pretty much not work in these environments at all, and "load indexes" will be a significant improvement over no warmup at all.
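
A minimal sketch of what such a strategy selects, under the assumption that every stored page can be classified by kind. PageKind, StoredPage and StructuralPagesWarmUp are hypothetical names for this example and do not correspond to Ignite classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical page classification. */
enum PageKind { INDEX, META, FREE_LIST, DATA }

/** Hypothetical descriptor of a page stored on disk. */
final class StoredPage {
    final long id;
    final PageKind kind;

    StoredPage(long id, PageKind kind) {
        this.id = id;
        this.kind = kind;
    }
}

/** Selects index pages, metapages and free-list pages; data pages stay cold. */
final class StructuralPagesWarmUp {
    /** Returns the IDs of the pages to load, in on-disk order. */
    static List<Long> plan(List<StoredPage> onDisk) {
        List<Long> toLoad = new ArrayList<>();

        for (StoredPage p : onDisk) {
            if (p.kind != PageKind.DATA)
                toLoad.add(p.id);
        }

        return toLoad;
    }
}
```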

Thanks,
Stan



Re: [DISCUSSION] Cache warmup

ткаленко кирилл
Hi, Stas!

After talking with Anton and Alexy about IEP-40, I updated the description of the implementation in my reply to Slava [1]. In short, I made three separate interfaces: the first is public and is used for strategy configuration, the second is internal and is used for strategy implementation, and the third allows strategies to be supplied by different plugins.
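
The split into three interfaces could look roughly like the sketch below. Only "WarmUpStrategy#warmUp" is named in this thread; the remaining names, the method signatures and the log parameter are assumptions made for illustration, and the linked description [1] remains the authoritative version.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Public, user-facing: only describes which strategy to use (name assumed). */
interface WarmUpConfiguration { }

/** Internal: performs the warm-up for one configuration type (signature assumed). */
interface WarmUpStrategy<T extends WarmUpConfiguration> {
    Class<T> configClass();

    void warmUp(T cfg, List<String> log);
}

/** Plugin-facing: lets a plugin contribute extra strategies (name assumed). */
interface WarmUpStrategySupplier {
    Collection<WarmUpStrategy<?>> strategies();
}

/** Example configuration for a "load all" strategy. */
final class LoadAllConfiguration implements WarmUpConfiguration { }

/** Example strategy; it only records that it ran. */
final class LoadAllStrategy implements WarmUpStrategy<LoadAllConfiguration> {
    public Class<LoadAllConfiguration> configClass() {
        return LoadAllConfiguration.class;
    }

    public void warmUp(LoadAllConfiguration cfg, List<String> log) {
        log.add("load-all");
    }
}

/** Collects strategies from all suppliers and dispatches by configuration class. */
final class WarmUpStrategies {
    private final Map<Class<?>, WarmUpStrategy<?>> byConfig = new HashMap<>();

    WarmUpStrategies(Collection<WarmUpStrategySupplier> suppliers) {
        for (WarmUpStrategySupplier supplier : suppliers)
            for (WarmUpStrategy<?> strategy : supplier.strategies())
                byConfig.put(strategy.configClass(), strategy);
    }

    @SuppressWarnings("unchecked")
    <T extends WarmUpConfiguration> void run(T cfg, List<String> log) {
        WarmUpStrategy<T> strategy = (WarmUpStrategy<T>) byConfig.get(cfg.getClass());

        if (strategy == null)
            throw new IllegalArgumentException("No strategy for " + cfg.getClass().getName());

        strategy.warmUp(cfg, log);
    }
}
```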

I will think about this and try to implement it. The warm-up phase will run before "discovery", so I'm not sure it will be possible to connect via control.sh. Perhaps it will be possible via JMX, but I think control.sh would be better.
> Will there be a way to interrupt warmup phase and continue startup (e.g. via JMX, REST and/or control.sh)? Can we have it please?

I was thinking about how and where to put the warm-up configuration, and I think it would be better to do it in IgniteConfiguration, since each strategy can work with caches, groups, regions, etc.
> I think that ideally warmup should be configured per-cache - I believe this is what a user would expect to do.
> However, cache configs are immutable. We need a way for existing users to enjoy the cache warmup feature, as well as for early adopters to switch to more sophisticated strategies as they will be released (or as their dataset grows).
> Because of that I propose to add the cache warmup configuration to the DataRegionConfiguration. Data regions can be changed between restarts, independently on each node allowing for a rolling change.

Possible.
> Will preloadPartition() method be deprecated together with this change? I assume yes?

I think it can be done as a new strategy, but this is at the discretion of the developers.
> How hard would it be to implement a "load all indexes, metapages and freelists" strategy in addition to the "load everything"?
> I think it would be an MVP for environments with datasets larger than RAM. A "load everything" strategy will not work in these environments pretty much at all,
> and "load indexes" will be a significant improvement to no warmup at all.

[1] - http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Cache-warmup-td48582.html#a48649


04.08.2020, 23:22, "Stanislav Lukyanov" <[hidden email]>:

> Kirill,
>
> Thanks for driving this. This is awaited by many users.
>
> A few comments and questions.
>
> I would keep CacheWarmup interface purely internal and never view it as an interface which a user would be implementing.
> There are multiple reasons for that:
> - The logic of the cache warmup is very low-level; how a user is supposed to know which pages they want?
> - A sophisticated strategy will require accessing private APIs for sure; say, I need a strategy which loads the last known memory state before the restart; how can I even implement that without breaking into various internals?
> - In fact there aren't many implementations which make sense ("load everything", "load indexes", "load last memory state", "load N GB at random"); every use case I've seen would be solved by a "load everything" strategy (if disk is < RAM) or "load last memory state" strategy
> - Warmup will be a critical phase, and a custom user implementation is all too likely to cause issues. We should avoid executing user code in critical stages if we can help it
> To summarize, if we give warmup strategies in users' hands they will be hard to write, will require breaking into internals or a lot of additional public interfaces for these internals, will likely cause issues with the cluster, and everyone will be implementing the same few general strategies.
> Basically, I expect only fellow Ignite developers to be implementing their own strategies.
> Because of that I propose to keep the interfaces private, and only give a single public parameter. The parameter can take an enum of the supported strategies. New useful strategies should be added to Ignite codebase.
>
> Will there be a way to interrupt warmup phase and continue startup (e.g. via JMX, REST and/or control.sh)? Can we have it please?
>
> I think that ideally warmup should be configured per-cache - I believe this is what a user would expect to do.
> However, cache configs are immutable. We need a way for existing users to enjoy the cache warmup feature, as well as for early adopters to switch to more sophisticated strategies as they will be released (or as their dataset grows).
> Because of that I propose to add the cache warmup configuration to the DataRegionConfiguration. Data regions can be changed between restarts, independently on each node allowing for a rolling change.
>
> Will preloadPartition() method be deprecated together with this change? I assume yes?
>
> How hard would it be to implement a "load all indexes, metapages and freelists" strategy in addition to the "load everything"?
> I think it would be an MVP for environments with datasets larger than RAM. A "load everything" strategy will not work in these environments pretty much at all, and "load indexes" will be a significant improvement over no warmup at all.
>
> Thanks,
> Stan
>
>>  On 4 Aug 2020, at 16:04, ткаленко кирилл <[hidden email]> wrote:
>>
>>  Hi, Denis!
>>
>>  For now, I suggest a simple warm-up implementation for the case where the persistent storage is smaller than RAM. If others want additional implementations, they can write them themselves by implementing the interfaces. For the first point, we need to figure out how and where we will remember pages, etc. Such tasks may require improvements in the kernel.
>>
>>  In the "WarmUpStrategy#warmUp" method we get "GridKernalContext#cache", from which we can reach caches and cache groups through "GridCacheProcessor#cacheGroups", "GridCacheProcessor#caches" and so on, and through them we can access pages.
>>>  The second one requires direct work with data pages, but not with a cache
>>>  context, so it's also impossible to implement.
>>
>>  This requires writing additional custom code, which may run longer due to its SQL features, and so on.
>>  It would be more convenient for both the developer and the grid administrator to just set a warm-up strategy.
>>>  When loading of all cache data is required, it can be done by running a
>>>  local scan query. It will iterate through all data pages and result in
>>>  their allocation in memory.
>>
>>  04.08.2020, 15:25, "Denis Mekhanikov" <[hidden email]>:
>>>  Kirill,
>>>
>>>  When I discussed this functionality with Ignite users, I heard the
>>>  following thoughts about warming up:
>>>
>>>     - Node restarts affect performance of queries. The main reason for that
>>>     is that the pages that were loaded into memory before the restart are on
>>>     disk after the restart. It takes time to reach the same distribution of
>>>     data between memory and disk. Until that point the performance is usually
>>>     degraded. No simple rule like "load everything" helps here if only a part
>>>     of data fits in memory.
>>>     - It would be nice to have a way to give preferences to indices when
>>>     doing a warmup. Usually indices are used more often than data nodes, so
>>>     loading indices first would bring more benefits.
>>>
>>>  The first point can be addressed by implementing the policy that would
>>>  restore the memory state that was observed before the restart. I don't see
>>>  how it can be implemented using the suggested interface.
>>>  The second one requires direct work with data pages, but not with a cache
>>>  context, so it's also impossible to implement.
>>>
>>>  When loading of all cache data is required, it can be done by running a
>>>  local scan query. It will iterate through all data pages and result in
>>>  their allocation in memory.
>>>
>>>  So, I don't really see a scenario when the suggested API will help. Do you
>>>  have a suitable use-case that will be covered?
>>>
>>>  Denis
>>>
>>>  Tue, 4 Aug 2020 at 13:42, ткаленко кирилл <[hidden email]>:
>>>
>>>>   Hi, Denis!
>>>>
>>>>   I previously answered Slava about the implementation I have in mind: it
>>>>   will now be possible to add custom warm-up strategy implementations, and
>>>>   they can be implemented in different ways.
>>>>
>>>>   At the moment, I suggest implementing one "Load all" strategy, which will
>>>>   be effective if the persistent storage is smaller than RAM.
>>>>
>>>>   28.07.2020, 19:46, "Denis Mekhanikov" <[hidden email]>:
>>>>   > Kirill,
>>>>   >
>>>>   > That will be a great feature! Other popular databases already have it (e.g.
>>>>   > Postgres: https://www.postgresql.org/docs/11/pgprewarm.html), so it's good
>>>>   > that we're also going to have it in Ignite.
>>>>   >
>>>>   > What implementation of CacheWarmup interface do you have in mind? Will
>>>>   > there be some preconfigured implementation, and will users be able to
>>>>   > implement it themselves?
>>>>   >
>>>>   > Do you think it should be cache-based? I would say that a DataRegion-based
>>>>   > warm-up would come more naturally. Page IDs that are loaded into the data
>>>>   > region can be dumped periodically to disk and recovered on restarts. This
>>>>   > is more or less how it works in Postgres.
>>>>   > I'm afraid that if we make it cache-based, the implementation won't be that
>>>>   > obvious. We already have an API for warmup that appeared to be pretty much
>>>>   > impossible to apply in a useful way:
>>>>   > https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/IgniteCache.html#preloadPartition-int-
>>>>   > Let's make sure that our new tool for warming up is actually useful.
>>>>   >
>>>>   > Denis
>>>>   >
>>>>   > Tue, 28 Jul 2020 at 09:17, Zhenya Stanilovsky <[hidden email]>:
>>>>   >
>>>>   >> Looks like we need an additional function for static caches, for
>>>>   >> example warmup(List<CacheConfiguration> cconf); it would be helpful for
>>>>   >> Spring too.
>>>>   >>
>>>>   >> >
>>>>   >> >------- Forwarded message -------
>>>>   >> >From: "Вячеслав Коптилин" < [hidden email] >
>>>>   >> >To: [hidden email]
>>>>   >> >Cc:
>>>>   >> >Subject: Re: [DISCUSSION] Cache warmup
>>>>   >> >Date: Mon, 27 Jul 2020 16:47:48 +0300
>>>>   >> >
>>>>   >> >Hello Kirill,
>>>>   >> >
>>>>   >> >Thanks a lot for driving this activity. If I am not mistaken, this
>>>>   >> >discussion relates to IEP-40.
>>>>   >> >
>>>>   >> >> I suggest adding a warmup phase after recovery here [1] after [2], before
>>>>   >> >> discovery.
>>>>   >> >This means that the user's thread, which starts Ignite via
>>>>   >> >Ignition.start(), will wait for an additional step - cache warm-up.
>>>>   >> >I think this fact has to be clearly mentioned in our documentation (at
>>>>   >> >least in the Javadoc) because this step can be time-consuming.
>>>>   >> >
>>>>   >> >> I suggest adding a new interface:
>>>>   >> >I would change it a bit. First of all, it would be nice to place this
>>>>   >> >interface in a public package and get rid of GridCacheContext,
>>>>   >> >which is an internal class and should not leak into the public API in any
>>>>   >> >case.
>>>>   >> >Perhaps this parameter is not needed at all, or we should add some public
>>>>   >> >abstraction instead of the internal class.
>>>>   >> >
>>>>   >> >package org.apache.ignite.configuration;
>>>>   >> >
>>>>   >> >import org.apache.ignite.IgniteCheckedException;
>>>>   >> >import org.apache.ignite.lang.IgniteFuture;
>>>>   >> >
>>>>   >> >public interface CacheWarmupper {
>>>>   >> >    /**
>>>>   >> >     * Warmup cache.
>>>>   >> >     *
>>>>   >> >     * @param cachename Cache name.
>>>>   >> >     * @return Future cache warmup.
>>>>   >> >     * @throws IgniteCheckedException If failed.
>>>>   >> >     */
>>>>   >> >    IgniteFuture<?> warmup(String cachename) throws IgniteCheckedException;
>>>>   >> >}
>>>>   >> >
>>>>   >> >Thanks,
>>>>   >> >S.
>>>>   >> >

Re: [DISCUSSION] Cache warmup

Alexey Goncharuk
Kirill,

Thank you for driving this discussion and implementation.

A few points from my side:
* Agree that it will be best to keep the strategy interface private because
it will be very dependent on the persistent storage implementation. We
would need to expose page IDs and types to public API, which is very
restrictive. The configuration part obviously needs to be public, and the
ability to pull the strategy implementation from a plugin is a good idea.
* I was also thinking of adding the warmup configuration straight to the
IgniteConfiguration, but I like Stan's idea of adding it to
DataRegionConfiguration. No strong preference here.
* I do not think we need to deprecate the preloadPartition() method. One of the
use-cases for this method was to process partitions sequentially while a
node is running. This method is able to fetch the partition from disk much
faster (from several times to orders of magnitude) than a sequential scan.
* Being able to cancel the warmup during startup is a great feature. We
should be able to support it from control.sh because the warmup runs before
discovery, which starts last, so the control.sh handler should already be
running.

Re: [DISCUSSION] Cache warmup

ткаленко кирилл
Hello, Alexey!

Your comments are fair.


Re: [DISCUSSION] Cache warmup

Stanislav Lukyanov
Hi Kirill, Alexey,

On the interface structure.
So, as a user I'll see one interface, WarmUpConfiguration, with no methods.
I choose an implementation and configure it like
    cfg.setWarmUpConfiguration(new LoadEverythingWarmupConfiguration());
Ignite tries to map the LoadEverythingWarmupConfiguration to an implementation it knows - either to a built-in one or to a plugin.
If it finds the implementation, it passes the configuration to it and it handles the warmup.
If it doesn't find an existing implementation, it throws an exception.
The implementation will use any internal API of Ignite that it needs to perform the warmup. It is up to the plugin maintainer to track code changes in Ignite and adjust the plugin to be compatible from version to version.
Is all of the above correct?
How about
     cfg.setWarmUpConfiguration(IgniteWarmupStrategies.LOAD_EVERYTHING);
instead of
     cfg.setWarmUpConfiguration(new LoadEverythingWarmupConfiguration());
?
Or do we expect a POJO to be more beneficial than a string constant?
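
A minimal sketch of the mapping described above, not the actual Ignite code: a registry keyed by configuration class that either finds a strategy or throws. The interfaces are simplified, hypothetical stand-ins for WarmUpConfiguration and the internal strategy interface (no kernal context, a String result instead of real page loading), and the Registry class is an assumption made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class WarmUpRegistrySketch {
    /** Hypothetical stand-in for the public marker interface (no methods). */
    interface WarmUpConfiguration { }

    /** Hypothetical, simplified stand-in for the internal strategy interface. */
    interface WarmUpStrategy<T extends WarmUpConfiguration> {
        /** Configuration class this strategy is mapped to. */
        Class<T> configClass();

        /** Performs the warm-up; returns a label only so the sketch is observable. */
        String warmUp(T cfg);
    }

    /** Configuration a user would pass to the setter. */
    static class LoadEverythingWarmupConfiguration implements WarmUpConfiguration { }

    /** Configuration with no registered strategy, used to show the failure path. */
    static class UnknownWarmupConfiguration implements WarmUpConfiguration { }

    /** Built-in strategy mapped to LoadEverythingWarmupConfiguration. */
    static class LoadEverythingStrategy implements WarmUpStrategy<LoadEverythingWarmupConfiguration> {
        @Override public Class<LoadEverythingWarmupConfiguration> configClass() {
            return LoadEverythingWarmupConfiguration.class;
        }

        @Override public String warmUp(LoadEverythingWarmupConfiguration cfg) {
            return "load-everything";
        }
    }

    /** Hypothetical registry holding built-in and plugin-provided strategies. */
    static class Registry {
        private final Map<Class<?>, WarmUpStrategy<?>> strategies = new HashMap<>();

        /** Registers a strategy under its configuration class; duplicates are rejected. */
        void register(WarmUpStrategy<?> strategy) {
            if (strategies.putIfAbsent(strategy.configClass(), strategy) != null)
                throw new IllegalStateException("Duplicate strategy for " + strategy.configClass().getName());
        }

        /** Looks up the strategy by the configuration's class and runs it, or throws. */
        @SuppressWarnings("unchecked")
        <T extends WarmUpConfiguration> String run(T cfg) {
            WarmUpStrategy<T> strategy = (WarmUpStrategy<T>)strategies.get(cfg.getClass());

            if (strategy == null)
                throw new IllegalArgumentException("No warm-up strategy for " + cfg.getClass().getName());

            return strategy.warmUp(cfg);
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register(new LoadEverythingStrategy());

        System.out.println(registry.run(new LoadEverythingWarmupConfiguration()));

        try {
            registry.run(new UnknownWarmupConfiguration());
        }
        catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

In this shape a plugin would only contribute one more strategy to the registry, and a configuration class that nothing is mapped to results in an exception. A POJO configuration can also carry its own parameters, which a bare constant cannot.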

Agree about preloadPartition(). Fair enough, let's leave it be.


On IgniteConfiguration vs DataRegionConfiguration.
I like DataRegionConfiguration more because it naturally allows specifying different strategies for different regions.
We already say that all cache groups in the same region share memory management (e.g. share space and participate in page replacement together).
So it's natural to say that if I want different memory warmup semantics for two groups then I should be putting them in different regions.
Do you see a good way to have distinct warmup configurations for different regions while the config is at the IgniteConfiguration level?

Thanks,
Stan


Re: [DISCUSSION] Cache warmup

ткаленко кирилл
Hi, Stan!

Yes, that's right.
> On the interface structure.
> So, as a user I'll see one interface, WarmUpConfiguration, with no methods.
> I choose an implementation and configure it like
>     cfg.setWarmUpConfiguration(new LoadEverythingWarmupConfiguration());
> Ignite tries to map the LoadEverythingWarmupConfiguration to an implementation it knows - either to a built-in one or to a plugin.
> If it finds the implementation, it passes the configuration to it and it handles the warmup.
> If it doesn't find an existing implementation, it throws an exception.
> The implementation will use any internal API of Ignite that it needs to perform the warmup. It is up to the plugin maintainer to track code changes in Ignite and adjust the plugin to be compatible from version to version.
> Is all of the above correct?

If the warm-up itself needs to be configured, a string constant will not work for us.
> How about
>      cfg.setWarmUpConfiguration(IgniteWarmupStrategies.LOAD_EVERYTHING);
> instead of
>      cfg.setWarmUpConfiguration(new LoadEverythingWarmupConfiguration());
> ?
> Or do we expect having POJO instead of a string constant to be beneficial?

It seems to me that if you need to specify regions for a strategy, you can add them to the strategy settings. Configuring warm-up per cache would be more natural for the user, but due to the internal design of Ignite that just doesn't work (yet), so it seems to me that additional options can be added to each configuration instead.


Re: [DISCUSSION] Cache warmup

ткаленко кирилл
In reply to this post by ткаленко кирилл
Hi, Stan!

As a result of our personal correspondence I realized that you are right, so I propose the following changes:
1) Move the warm-up configuration to org.apache.ignite.configuration.DataRegionConfiguration#setWarmUpConfiguration;
2) Warm up the regions sequentially, one after another;
3) Improve the warm-up interface:

package org.apache.ignite.internal.processors.cache.warmup;

import org.apache.ignite.IgniteCheckedException;
import org.apache.ignite.configuration.WarmUpConfiguration;
import org.apache.ignite.internal.GridKernalContext;
import org.apache.ignite.internal.processors.cache.persistence.DataRegion;

/**
 * Interface for warming up.
 */
public interface WarmUpStrategy<T extends WarmUpConfiguration> {
    /**
     * Returns configuration class for mapping to strategy.
     *
     * @return Configuration class.
     */
    Class<T> configClass();

    /**
     * Warm up.
     *
     * @param kernalCtx Kernal context.
     * @param cfg       Warm-up configuration.
     * @param region    Data region.
     * @throws IgniteCheckedException if failed.
     */
    void warmUp(GridKernalContext kernalCtx, T cfg, DataRegion region) throws IgniteCheckedException;

    /**
     * Closing warm up.
     *
     * @throws IgniteCheckedException if failed.
     */
    void close() throws IgniteCheckedException;
}

4) Add a command to "control.sh" that stops the current warm-up and cancels all remaining ones: --warm-up stop
5) The "load all" strategy will load pages as long as there is enough RAM, and index pages will be loaded with priority.

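
Points 2 and 4 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed Ignite code: regions are represented by plain names, the RegionWarmUp interface and the stop flag are hypothetical, and a real strategy would have to check the flag itself while loading pages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class SequentialWarmUpSketch {
    /** Hypothetical per-region warm-up action; a real one would load pages. */
    interface RegionWarmUp {
        void warmUp(String regionName, AtomicBoolean stopFlag);
    }

    /** Set by stop(); a strategy is expected to check it periodically. */
    private final AtomicBoolean stopped = new AtomicBoolean();

    /** Models a stop request: the current warm-up should finish early, the rest are skipped. */
    void stop() {
        stopped.set(true);
    }

    /** Warms up regions one by one; returns the names of regions whose warm-up was started. */
    List<String> warmUpAll(List<String> regionNames, RegionWarmUp strategy) {
        List<String> started = new ArrayList<>();

        for (String name : regionNames) {
            // A stop request cancels every region that has not been started yet.
            if (stopped.get())
                break;

            started.add(name);
            strategy.warmUp(name, stopped);
        }

        return started;
    }

    public static void main(String[] args) {
        SequentialWarmUpSketch runner = new SequentialWarmUpSketch();

        // Simulate a stop request arriving while the second region is being warmed up.
        List<String> started = runner.warmUpAll(
            Arrays.asList("default", "indexes", "archive"),
            (name, stopFlag) -> {
                if (name.equals("indexes"))
                    runner.stop();
            });

        System.out.println(started);
    }
}
```

In the usage above the third region is never started, which matches the "stop current warm-up and cancel all others" semantics of the command.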

04.08.2020, 13:29, "ткаленко кирилл" <[hidden email]>:

> Hi, Slava!
>
> Thank you for looking at the proposal and making fair comments.
>
> I discussed this personally with Anton and Alexey, because they are the author and sponsor of "IEP-40", and we found out that point 2 in it is no longer relevant and can be removed.
> I suggest implementing point 3, since it may be independent of point 1. Also, the warm-up will always start after the restore phase, without subscribing to events.
>
> You are right, this should be mentioned in the documentation and javadoc.
>>  This means that the user's thread, which starts Ignite via
>>  Ignition.start(), will wait for an additional step - cache warm-up.
>>  I think this fact has to be clearly mentioned in our documentation (at
>>  least in the Javadoc) because this step can be time-consuming.
>
> My suggestion for implementation:
> 1) Add a marker interface "org.apache.ignite.configuration.WarmUpConfiguration" for configuring cache warm-up;
> 2) Set only one configuration via "org.apache.ignite.configuration.IgniteConfiguration#setWarmUpConfiguration";
> 3) Add an internal warm-up interface whose implementation will start in [1] after [2];
>
> package org.apache.ignite.internal.processors.cache.warmup;
>
> import org.apache.ignite.IgniteCheckedException;
> import org.apache.ignite.configuration.WarmUpConfiguration;
> import org.apache.ignite.internal.GridKernalContext;
>
> /**
>  * Interface for warming up.
>  */
> public interface WarmUpStrategy<T extends WarmUpConfiguration> {
>     /**
>      * Returns configuration class for mapping to strategy.
>      *
>      * @return Configuration class.
>      */
>     Class<T> configClass();
>
>     /**
>      * Warm up.
>      *
>      * @param kernalCtx Kernal context.
>      * @param cfg Warm-up configuration.
>      * @throws IgniteCheckedException if failed.
>      */
>     void warmUp(GridKernalContext kernalCtx, T cfg) throws IgniteCheckedException;
> }
>
> 4) Add an internal plugin extension for supplying custom strategies;
>
> package org.apache.ignite.internal.processors.cache.warmup;
>
> import java.util.Collection;
> import org.apache.ignite.plugin.Extension;
>
> /**
>  * Interface for getting warm-up strategies from plugins.
>  */
> public interface WarmUpStrategySupplier extends Extension {
>     /**
>      * Getting warm-up strategies.
>      *
>      * @return Warm-up strategies.
>      */
>     Collection<WarmUpStrategy> strategies();
> }
>
> 5) Add a "Load all" strategy that will load everything into memory as long as there is space in it. This strategy is suitable if the persistent storage is smaller than RAM.
>
> Any objections or comments?
>
> [1] - org.apache.ignite.internal.processors.cache.GridCacheProcessor.CacheRecoveryLifecycle#afterLogicalUpdatesApplied
> [2] - org.apache.ignite.internal.processors.cache.GridCacheProcessor.CacheRecoveryLifecycle#restorePartitionStates
>
> 27.07.2020, 16:48, "Вячеслав Коптилин" <[hidden email]>:
>>  Hello Kirill,
>>
>>  Thanks a lot for driving this activity. If I am not mistaken, this
>>  discussion relates to IEP-40.
>>
>>>   I suggest adding a warmup phase after recovery here [1] after [2], before
>>>   discovery.
>>
>>  This means that the user's thread, which starts Ignite via
>>  Ignition.start(), will wait for an additional step - cache warm-up.
>>  I think this fact has to be clearly mentioned in our documentation (at
>>  least in the Javadoc) because this step can be time-consuming.
>>
>>>   I suggest adding a new interface:
>>
>>  I would change it a bit. First of all, it would be nice to place this
>>  interface in a public package and get rid of GridCacheContext,
>>  which is an internal class and should not leak into the public API in any
>>  case.
>>  Perhaps this parameter is not needed at all, or we should add some public
>>  abstraction instead of the internal class.
>>
>>  package org.apache.ignite.configuration;
>>
>>  import org.apache.ignite.IgniteCheckedException;
>>  import org.apache.ignite.lang.IgniteFuture;
>>
>>  public interface CacheWarmupper {
>>      /**
>>       * Warmup cache.
>>       *
>>       * @param cachename Cache name.
>>       * @return Future cache warmup.
>>       * @throws IgniteCheckedException If failed.
>>       */
>>      IgniteFuture<?> warmup(String cachename) throws IgniteCheckedException;
>>  }
>>
>>  Thanks,
>>  S.
>>

Re: [DISCUSSION] Cache warmup

ткаленко кирилл
Hi, Stan again :-)

I suggest adding a little more flexibility to configuration:
1) Add a default warm-up configuration for all regions via org.apache.ignite.configuration.DataStorageConfiguration#setDefaultWarmUpConfiguration
2) Add a NoOp strategy for turning off warm-up of a specific region

Thus, when warm-up starts, the region's own configuration is used first, and if it is not set, the default one is used. And if we don't want to warm up a region, we set NoOp for it.
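
The resolution order described above can be sketched as follows. This is a sketch under stated assumptions rather than the actual implementation: the two configuration classes and the resolve helper are hypothetical names introduced only for illustration.

```java
import java.util.Optional;

class WarmUpConfigResolutionSketch {
    /** Hypothetical stand-in for the public marker interface. */
    interface WarmUpConfiguration { }

    /** Hypothetical configuration that switches warm-up off for a region. */
    static final class NoOpWarmUpConfiguration implements WarmUpConfiguration { }

    /** Hypothetical configuration for the "load all" strategy. */
    static final class LoadAllWarmUpConfiguration implements WarmUpConfiguration { }

    /**
     * Picks the configuration to run for a region: the region's own one wins,
     * otherwise the default applies; NoOp or no configuration means no warm-up.
     */
    static Optional<WarmUpConfiguration> resolve(WarmUpConfiguration regionCfg, WarmUpConfiguration dfltCfg) {
        WarmUpConfiguration effective = regionCfg != null ? regionCfg : dfltCfg;

        if (effective == null || effective instanceof NoOpWarmUpConfiguration)
            return Optional.empty();

        return Optional.of(effective);
    }

    public static void main(String[] args) {
        WarmUpConfiguration dflt = new LoadAllWarmUpConfiguration();

        // Region without its own configuration falls back to the default.
        System.out.println(resolve(null, dflt).isPresent());

        // Region explicitly opted out with NoOp, despite the default.
        System.out.println(resolve(new NoOpWarmUpConfiguration(), dflt).isPresent());
    }
}
```

The point of the NoOp configuration is visible in the second call: without it there would be no way to exclude a single region once a default is set.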

10.08.2020, 10:20, "ткаленко кирилл" <[hidden email]>:

> Hi, Stan!
>
> As a result of our personal correspondence I realized that you are right, so I propose the following changes:
> 1) Move the warm-up configuration to org.apache.ignite.configuration.DataRegionConfiguration#setWarmUpConfiguration;
> 2) Warm up the regions sequentially, one after another;
> 3) Improve the warm-up interface:
>
> package org.apache.ignite.internal.processors.cache.warmup;
>
> import org.apache.ignite.IgniteCheckedException;
> import org.apache.ignite.configuration.WarmUpConfiguration;
> import org.apache.ignite.internal.GridKernalContext;
> import org.apache.ignite.internal.processors.cache.persistence.DataRegion;
>
> /**
>  * Interface for warming up.
>  */
> public interface WarmUpStrategy<T extends WarmUpConfiguration> {
>     /**
>      * Returns configuration class for mapping to strategy.
>      *
>      * @return Configuration class.
>      */
>     Class<T> configClass();
>
>     /**
>      * Warm up.
>      *
>      * @param kernalCtx Kernal context.
>      * @param cfg Warm-up configuration.
>      * @param region Data region.
>      * @throws IgniteCheckedException if failed.
>      */
>     void warmUp(GridKernalContext kernalCtx, T cfg, DataRegion region) throws IgniteCheckedException;
>
>     /**
>      * Closing warm up.
>      *
>      * @throws IgniteCheckedException if failed.
>      */
>     void close() throws IgniteCheckedException;
> }
>
> 4) Add a command to "control.sh" that stops the current warm-up and cancels all remaining ones: --warm-up stop
> 5) The "load all" strategy will load pages as long as there is enough RAM, and index pages will be loaded with priority.
>
> 04.08.2020, 13:29, "ткаленко кирилл" <[hidden email]>:
>>  Hi, Slava!
>>
>>  Thank you for looking at the proposal and making fair comments.
>>
>>  I discussed this personally with Anton and Alexey, because they are the author and sponsor of "IEP-40", and we found out that point 2 in it is no longer relevant and can be removed.
>>  I suggest implementing point 3, since it may be independent of point 1. Also, the warm-up will always start after the restore phase, without subscribing to events.
>>
>>  You are right, this should be mentioned in the documentation and javadoc.
>>>   This means that the user's thread, which starts Ignite via
>>>   Ignition.start(), will wait for an additional step - cache warm-up.
>>>   I think this fact has to be clearly mentioned in our documentation (at
>>>   least in the Javadoc) because this step can be time-consuming.
>>
>>  My suggestion for implementation:
>>  1) Add a marker interface "org.apache.ignite.configuration.WarmUpConfiguration" for configuring cache warm-up;
>>  2) Set only one configuration via "org.apache.ignite.configuration.IgniteConfiguration#setWarmUpConfiguration";
>>  3) Add an internal warm-up interface whose implementation will start in [1] after [2];
>>
>>  package org.apache.ignite.internal.processors.cache.warmup;
>>
>>  import org.apache.ignite.IgniteCheckedException;
>>  import org.apache.ignite.configuration.WarmUpConfiguration;
>>  import org.apache.ignite.internal.GridKernalContext;
>>
>>  /**
>>   * Interface for warming up.
>>   */
>>  public interface WarmUpStrategy<T extends WarmUpConfiguration> {
>>      /**
>>       * Returns configuration class for mapping to strategy.
>>       *
>>       * @return Configuration class.
>>       */
>>      Class<T> configClass();
>>
>>      /**
>>       * Warm up.
>>       *
>>       * @param kernalCtx Kernal context.
>>       * @param cfg Warm-up configuration.
>>       * @throws IgniteCheckedException if faild.
>>       */
>>      void warmUp(GridKernalContext kernalCtx, T cfg) throws IgniteCheckedException;
>>  }
>>
>>  4)Adding an internal plugin extension for add own strategies;
>>
>>  package org.apache.ignite.internal.processors.cache.warmup;
>>
>>  import java.util.Collection;
>>  import org.apache.ignite.plugin.Extension;
>>
>>  /**
>>   * Interface for getting warm-up strategies from plugins.
>>   */
>>  public interface WarmUpStrategySupplier extends Extension {
>>      /**
>>       * Getting warm-up strategies.
>>       *
>>       * @return Warm-up strategies.
>>       */
>>      Collection<WarmUpStrategy> strategies();
>>  }
>>
>>  5)Add a "Load all" strategy that will load everything to memory as long as there is space in it. This strategy is suitable if the persistent storage is less than RAM.
>>
>>  Any objections or comments?
>>
>>  [1] - org.apache.ignite.internal.processors.cache.GridCacheProcessor.CacheRecoveryLifecycle#afterLogicalUpdatesApplied
>>  [2] - org.apache.ignite.internal.processors.cache.GridCacheProcessor.CacheRecoveryLifecycle#restorePartitionStates
>>
>>  27.07.2020, 16:48, "Вячеслав Коптилин" <[hidden email]>:
>>>   Hello Kirill,
>>>
>>>   Thanks a lot for driving this activity. If I am not mistaken, this
>>>   discussion relates to IEP-40.
>>>
>>>>    I suggest adding a warmup phase after recovery here [1] after [2], before
>>>
>>>   discovery.
>>>   This means that the user's thread, which starts Ignite via
>>>   Ignition.start(), will wait for ana additional step - cache warm-up.
>>>   I think this fact has to be clearly mentioned in our documentation (at
>>>   Javadocat least) because this step can be time-consuming.
>>>
>>>>    I suggest adding a new interface:
>>>
>>>   I would change it a bit. First of all, it would be nice to place this
>>>   interface to a public package and get rid of using GridCacheContext,
>>>   which is an internal class and it should not leak to the public API in any
>>>   case.
>>>   Perhaps, this parameter is not needed at all or we should add some public
>>>   abstraction instead of internal class.
>>>
>>>   package org.apache.ignite.configuration;
>>>
>>>   import org.apache.ignite.IgniteCheckedException;
>>>   import org.apache.ignite.lang.IgniteFuture;
>>>
>>>   public interface CacheWarmupper {
>>>       /**
>>>        * Warmup cache.
>>>        *
>>>        * @param cachename Cache name.
>>>        * @return Future cache warmup.
>>>        * @throws IgniteCheckedException If failed.
>>>        */
>>>       IgniteFuture<?> warmup(String cachename) throws IgniteCheckedException;
>>>   }
>>>
>>>   Thanks,
>>>   S.
>>>
>>>   пн, 27 июл. 2020 г. в 15:03, ткаленко кирилл <[hidden email]>:
>>>
>>>>    Now, after restarting node, we have only cold caches, which at first
>>>>    requests to them will gradually load data from disks, which can slow down
>>>>    first calls to them.
>>>>    If node has more RAM than data on disk, then they can be loaded at start
>>>>    "warmup", thereby solving the issue of slowdowns during first calls to
>>>>    caches.
>>>>
>>>>    I suggest adding a warmup phase after recovery here [1] after [2], before
>>>>    descovery.
>>>>
>>>>    I suggest adding a new interface:
>>>>
>>>>    package org.apache.ignite.internal.processors.cache;
>>>>
>>>>    import org.apache.ignite.IgniteCheckedException;
>>>>    import org.apache.ignite.internal.IgniteInternalFuture;
>>>>    import org.jetbrains.annotations.Nullable;
>>>>
>>>>    /**
>>>>     * Interface for warming up cache.
>>>>     */
>>>>    public interface CacheWarmup {
>>>>        /**
>>>>         * Warmup cache.
>>>>         *
>>>>         * @param cacheCtx Cache context.
>>>>         * @return Future cache warmup.
>>>>         * @throws IgniteCheckedException if failed.
>>>>         */
>>>>        @Nullable IgniteInternalFuture<?> process(GridCacheContext cacheCtx)
>>>>    throws IgniteCheckedException;
>>>>    }
>>>>
>>>>    Which will allow to warm up caches in parallel and asynchronously. Warmup
>>>>    phase will end after all IgniteInternalFuture for all caches isDone.
>>>>
>>>>    Also adding the ability to customize via methods:
>>>>    org.apache.ignite.configuration.IgniteConfiguration#setDefaultCacheWarmup
>>>>    org.apache.ignite.configuration.CacheConfiguration#setCacheWarmup
>>>>
>>>>    Which will allow for each cache to set implementation of cache warming up,
>>>>    both for a specific cache, and for all if necessary.
>>>>
>>>>    I suggest adding an implementation of SequentialWarmup that will use [3].
>>>>
>>>>    Questions, suggestions, comments?
>>>>
>>>>    [1] -
>>>>    org.apache.ignite.internal.processors.cache.GridCacheProcessor.CacheRecoveryLifecycle#afterLogicalUpdatesApplied
>>>>    [2] -
>>>>    org.apache.ignite.internal.processors.cache.GridCacheProcessor.CacheRecoveryLifecycle#restorePartitionStates
>>>>    [3] -
>>>>    org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManager.CacheDataStore#preload

Re: [DISCUSSION] Cache warmup

Stanislav Lukyanov
All of this looks awesome, covers the use cases I know about.
Thanks!

Stan

> On 10 Aug 2020, at 15:39, ткаленко кирилл <[hidden email]> wrote:
>
> Hi, Stan again :-)
>
> I suggest adding a little more flexibility to the configuration:
> 1) Add a default warm-up configuration for all regions via org.apache.ignite.configuration.DataStorageConfiguration#setDefaultWarmUpConfiguration;
> 2) Add a NoOp strategy for turning off warm-up of a specific region.
>
> Thus, when warm-up starts, the region's own configuration is used first, and if it is not set, the default one is used. If we don't want to warm up a region, we set NoOp for it.
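
The fallback described above can be sketched as follows. This is a minimal illustration, not Ignite code: WarmUpConfigResolver, NoOpWarmUpConfiguration and LoadAllWarmUpConfiguration are hypothetical names, and treating "nothing configured anywhere" as "no warm-up" is an assumption that the thread does not state.

```java
// Hypothetical stand-ins for the marker interface and two configurations.
interface WarmUpConfiguration { }

final class NoOpWarmUpConfiguration implements WarmUpConfiguration { }

final class LoadAllWarmUpConfiguration implements WarmUpConfiguration { }

final class WarmUpConfigResolver {
    /**
     * Picks the configuration for one region: the region's own value wins,
     * otherwise the storage-wide default is used. May return null.
     */
    static WarmUpConfiguration resolve(WarmUpConfiguration regionCfg, WarmUpConfiguration dfltCfg) {
        return regionCfg != null ? regionCfg : dfltCfg;
    }

    /**
     * A region is warmed up only if some configuration applies to it
     * and that configuration is not the NoOp one.
     * Assumption: no configuration at all means no warm-up.
     */
    static boolean shouldWarmUp(WarmUpConfiguration regionCfg, WarmUpConfiguration dfltCfg) {
        WarmUpConfiguration cfg = resolve(regionCfg, dfltCfg);

        return cfg != null && !(cfg instanceof NoOpWarmUpConfiguration);
    }
}
```

Under these assumptions a region without its own setting inherits the default, while a region explicitly set to NoOp is skipped even when the default asks for "load all".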


Re: [DISCUSSION] Cache warmup

ткаленко кирилл
Great! Then I will proceed with the ticket https://issues.apache.org/jira/browse/IGNITE-13345.

10.08.2020, 16:30, "Stanislav Lukyanov" <[hidden email]>:

> All of this looks awesome, covers the use cases I know about.
> Thanks!
>
> Stan