Initializable _push_initialization_config

A frequent use case I have when using Blocks involves creating a graph of bricks (most of them recursive) and needing to fine-tune the weights of only a few of the child bricks. The workflow suggested by the docs is to set the top-level initialization, call push_initialization_config, and then manually go in and set the required weights. This spreads the configuration across multiple locations in a code base and leaves no easy way to create prefab bricks with initializations already set inside them. In addition, the overwriting behaviour is unintuitive: there is no warning when a user manually sets weights_init and sees no effect. For example:

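from blocks.bricks import Initializable, Linear
from blocks.initialization import Constant

class MultiBrick(Initializable):
    def __init__(self, **kwargs):
        super(MultiBrick, self).__init__(**kwargs)
        self.lin1 = Linear(weights_init=Constant(1))  # set explicitly on the child
        self.lin2 = Linear()                          # left to the parent's config
        self.children = [self.lin1, self.lin2]
b = MultiBrick(weights_init=Constant(0), biases_init=Constant(0))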

I would expect b.lin1.weights_init to be Constant(1), but this is not the case. Here it's easy enough to set it manually after push_initialization_config, but that becomes less practical the larger the graphs get.

As a solution, could Initializable._push_initialization_config first check whether child.weights_init is already set and, if so, leave the original value instead of overwriting it? A quick look at the source suggests Initializable is the only class using _push_initialization_config.
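
To make the request concrete, here is a minimal sketch of the behaviour I have in mind. It is a toy model in plain Python, not the Blocks source: Brick and Constant below are hypothetical stand-ins, and I am assuming that an unset weights_init can be represented as None.

```python
class Constant:
    """Stand-in for an initialization scheme; NOT the Blocks class."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return "Constant({})".format(self.value)


class Brick:
    """Hypothetical minimal brick holding init settings and children."""

    def __init__(self, weights_init=None, biases_init=None):
        self.weights_init = weights_init
        self.biases_init = biases_init
        self.children = []

    def push_initialization_config(self):
        for child in self.children:
            # Only fill in settings the child did not set itself.
            if child.weights_init is None:
                child.weights_init = self.weights_init
            if child.biases_init is None:
                child.biases_init = self.biases_init
            # Recurse so grandchildren inherit the resolved values.
            child.push_initialization_config()


class MultiBrick(Brick):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.lin1 = Brick(weights_init=Constant(1))  # explicit child setting
        self.lin2 = Brick()                          # nothing set on the child
        self.children = [self.lin1, self.lin2]


b = MultiBrick(weights_init=Constant(0), biases_init=Constant(0))
b.push_initialization_config()
```

With this rule, b.lin1 keeps its own weights_init while b.lin2 and both biases pick up the parent's values, so a prefab brick can carry its own defaults and the top level only fills in what is missing.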

Thanks for your consideration.


2020/11/26 21:53