fluid.clip

ErrorClipByValue

class paddle.fluid.clip.ErrorClipByValue(max, min=None)

Clips tensor values to the range [min, max].
Given a tensor t, this operation clips its values to the range [min, max] in place:
- Any values less than min are set to min.
- Any values greater than max are set to max.
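As a plain illustration of this elementwise rule, here is a minimal NumPy sketch (not Paddle's implementation; the function and argument names are made up for illustration), including the documented default of min = -max:

import numpy as np

def clip_by_value(t, max_value, min_value=None):
    # Mirrors the documented rule: values below min_value become min_value,
    # values above max_value become max_value.
    if min_value is None:
        min_value = -max_value  # documented default when min is not supplied
    return np.clip(t, min_value, max_value)

t = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(clip_by_value(t, max_value=1.0))  # [-1.  -0.5  0.   0.5  1. ]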
- Parameters
max (float) – The maximum value to clip by.
min (float, optional) – The minimum value to clip by. If not set by the user, it will be set to -max by the framework.
Examples
import paddle.fluid as fluid

BATCH_SIZE = 128
CLIP_MAX = 2e-6
CLIP_MIN = -1e-6

prog = fluid.framework.Program()
with fluid.program_guard(main_program=prog):
    # Build a small feed-forward network.
    image = fluid.layers.data(name='x', shape=[784], dtype='float32')
    hidden1 = fluid.layers.fc(input=image, size=128, act='relu')
    hidden2 = fluid.layers.fc(input=hidden1, size=64, act='relu')
    predict = fluid.layers.fc(input=hidden2, size=10, act='softmax')
    label = fluid.layers.data(name='y', shape=[1], dtype='int64')
    cost = fluid.layers.cross_entropy(input=predict, label=label)
    avg_cost = fluid.layers.mean(cost)

# Clone the program and attach error clipping to hidden1's output variable.
prog_clip = prog.clone()
prog_clip.block(0).var(hidden1.name)._set_error_clip(
    fluid.clip.ErrorClipByValue(max=CLIP_MAX, min=CLIP_MIN))
GradientClipByGlobalNorm

class paddle.fluid.clip.GradientClipByGlobalNorm(clip_norm, group_name='default_group')

Clips the values of multiple tensors so that their global L2 norm does not exceed clip_norm.
Given a list of tensors t_list and a clipping threshold clip_norm, this operation returns a list of clipped tensors list_clipped and the global norm (global_norm) of all tensors in t_list.
To perform the clipping, the values \(t\_list[i]\) are set to:

\[t\_list[i] = t\_list[i] * \frac{clip\_norm}{\max(global\_norm, clip\_norm)}\]

where:

\[global\_norm = \sqrt{\sum_{i=0}^{N-1}(l2norm(t\_list[i]))^2}\]

If \(clip\_norm > global\_norm\), the entries in t_list remain as they are; otherwise they are all shrunk by the global ratio.
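For intuition, here is a minimal NumPy sketch of the formula above (an illustration, not Paddle's implementation; the function name is made up):

import numpy as np

def clip_by_global_norm(t_list, clip_norm):
    # global_norm = sqrt of the sum of squared L2 norms of all tensors.
    global_norm = np.sqrt(sum(np.sum(t ** 2) for t in t_list))
    # Scale everything by clip_norm / max(global_norm, clip_norm):
    # a no-op when global_norm <= clip_norm, a uniform shrink otherwise.
    scale = clip_norm / max(global_norm, clip_norm)
    return [t * scale for t in t_list], global_norm

grads = [np.array([3.0, 4.0]), np.array([12.0])]  # L2 norms 5 and 12, global_norm = 13
clipped, gn = clip_by_global_norm(grads, clip_norm=2.0)
print(gn)       # 13.0
print(clipped)  # every entry uniformly scaled by 2/13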
- Parameters
clip_norm (float) – The maximum global norm value.
group_name (str, optional) – The group name for this clip.
Examples
import paddle.fluid as fluid

prog = fluid.framework.Program()
startup_program = fluid.framework.Program()
with fluid.program_guard(main_program=prog, startup_program=startup_program):
    # Build the forward network and the loss.
    image = fluid.layers.data(name='x', shape=[784], dtype='float32')
    label = fluid.layers.data(name='y', shape=[1], dtype='int64')
    hidden1 = fluid.layers.fc(input=image, size=128, act='relu')
    hidden2 = fluid.layers.fc(input=hidden1, size=64, act='relu')
    predict = fluid.layers.fc(input=hidden2, size=10, act='softmax')
    cost = fluid.layers.cross_entropy(input=predict, label=label)
    avg_cost = fluid.layers.mean(cost)

# Clone the program, append the backward pass, then register
# global-norm clipping and insert the clip ops.
prog_clip = prog.clone()
avg_cost_clip = prog_clip.block(0).var(avg_cost.name)
p_g_clip = fluid.backward.append_backward(loss=avg_cost_clip)

with fluid.program_guard(main_program=prog_clip):
    fluid.clip.set_gradient_clip(
        fluid.clip.GradientClipByGlobalNorm(clip_norm=2.0))
    p_g_clip = fluid.clip.append_gradient_clip_ops(p_g_clip)
GradientClipByNorm

class paddle.fluid.clip.GradientClipByNorm(clip_norm)

Clips tensor values to a maximum L2 norm.
This operator limits the L2 norm of the input \(X\) to at most \(max\_norm\) (the clip_norm argument). If the L2 norm of \(X\) is less than or equal to \(max\_norm\), \(Out\) is identical to \(X\). If the L2 norm of \(X\) is greater than \(max\_norm\), \(X\) is linearly scaled so that the L2 norm of \(Out\) equals \(max\_norm\):

\[Out = \frac{max\_norm * X}{norm(X)}\]

where \(norm(X)\) represents the L2 norm of \(X\).
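A minimal NumPy sketch of the same rule (illustration only, not the Paddle operator; the function name is made up):

import numpy as np

def clip_by_norm(x, max_norm):
    norm = np.sqrt(np.sum(x ** 2))  # L2 norm of X
    if norm <= max_norm:
        return x                    # within the bound: returned unchanged
    return max_norm * x / norm      # rescaled so the output norm equals max_norm

x = np.array([3.0, 4.0])              # L2 norm = 5.0
print(clip_by_norm(x, max_norm=2.0))  # [1.2 1.6], L2 norm = 2.0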
- Parameters
clip_norm (float) – The maximum norm value.
Examples
import paddle.fluid as fluid

# Attach per-parameter gradient clipping through the parameter's ParamAttr.
w_param_attrs = fluid.ParamAttr(
    name=None,
    initializer=fluid.initializer.UniformInitializer(low=-1.0, high=1.0, seed=0),
    learning_rate=1.0,
    regularizer=fluid.regularizer.L1Decay(1.0),
    trainable=True,
    gradient_clip=fluid.clip.GradientClipByNorm(clip_norm=2.0))

x = fluid.layers.data(name='x', shape=[10], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, param_attr=w_param_attrs)
GradientClipByValue

class paddle.fluid.clip.GradientClipByValue(max, min=None)

Clips gradient values to the range [min, max].
Given a gradient tensor t, this operation clips its values to the range [min, max] in place:
- Any values less than min are set to min.
- Any values greater than max are set to max.
- Parameters
max (float) – The maximum value to clip by.
min (float, optional) – The minimum value to clip by. If not set by the user, it will be set to -max by the framework.
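Because of that default, omitting min is equivalent to passing its negation explicitly; a short usage sketch (the values here are illustrative, not recommendations):

import paddle.fluid as fluid

# Equivalent by the documented default min = -max:
clip_a = fluid.clip.GradientClipByValue(max=1.0)            # clips gradients to [-1.0, 1.0]
clip_b = fluid.clip.GradientClipByValue(max=1.0, min=-1.0)  # same range, stated explicitly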
Examples
import paddle.fluid as fluid

# Attach per-parameter gradient clipping through the parameter's ParamAttr.
# Note the keyword arguments: the signature is (max, min=None), so passing
# (-1.0, 1.0) positionally would reverse the bounds.
w_param_attrs = fluid.ParamAttr(
    name=None,
    initializer=fluid.initializer.UniformInitializer(low=-1.0, high=1.0, seed=0),
    learning_rate=1.0,
    regularizer=fluid.regularizer.L1Decay(1.0),
    trainable=True,
    gradient_clip=fluid.clip.GradientClipByValue(min=-1.0, max=1.0))

x = fluid.layers.data(name='x', shape=[10], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, param_attr=w_param_attrs)