Add the alpha dropout method #24
Conversation
Maybe you can refer to the Keras and PyTorch versions here.

How fantastic!!! Thank you~

Hi @BoltzmannZhaung - thanks for doing this, on first glance it looks great :) Will take a closer look soon.

Thank you~!
    a = ((1 - self.p) * (1 + self.p * alpha_p ** 2)) ** -0.5
    b = -a * alpha_p * self.p

    keep = self._greaterOrEqual(np.random.uniform(0, 1, X.shape))
Does this run? It looks like you don't pass a y value to your _greaterOrEqual method.
In general, I don't think you need to create a separate method. A one-liner is still pretty clean (IMO): np.random.rand(*X.shape) >= self.p.
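To illustrate the suggested one-liner outside the class (a sketch only; `p` stands in for `self.p`, the drop probability, and `X` is a placeholder input):

```python
import numpy as np

# Hedged sketch of the suggested mask one-liner, not the PR's code.
p = 0.3
X = np.ones((4, 5))

# One expression replaces the _greaterOrEqual helper: each unit is kept
# (True) with probability 1 - p and dropped (False) with probability p.
keep = np.random.rand(*X.shape) >= p

print(keep.shape)  # (4, 5)
print(keep.dtype)  # bool
```

The comparison broadcasts over the whole array at once, so no separate helper method is needed.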
More generally, this function is very similar to the Keras implementation here.
I want to stress that all code in this repo should be your own work, or should properly cite the code it's based on.
    class AlphaDropout(WrapperBase):
        def __init__(self, wrapped_layer, p):
            """
            Alpha Dropout is a `Dropout` that aim at keeping mean and variance to
It looks like this description is copied from the TensorFlow documentation. I think it's better if we have documentation be more explicit + verbose for a project like this, since the expectation is that these will be "reference" implementations for people to use when trying to understand how the different model components work.
With that in mind, this should be rewritten. I'm happy to help proofread and revise grammar as you'd like :-)
    keep = self._greaterOrEqual(np.random.uniform(0, 1, X.shape))
    x = X * keep + alpha_p * (1 - keep)
    X = a * x + b
    self._wrapper_derived_variables["AlphaDropout_mask"] = keep
This is inconsistent with dropout_mask in line 165. To maintain consistency with the regular Dropout wrapper, I'd recommend changing this to dropout_mask as well.
    self._wrapper_derived_variables["AlphaDropout_mask"] = keep
    return scaler * self._base_layer.forward(X)

    def backward(self, dLdy):
This is not correct for AlphaDropout, as X is not scaled by 1 / (1 - p) - it's a function of alpha, scale, and p now.
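For reference, here is a minimal standalone sketch of a consistent forward/backward pair for alpha dropout. This is an illustration under assumptions (the SELU alpha/scale constants from Klambauer et al., with function names invented here), not the PR's implementation. Since the output is a * (X * keep + alpha_p * (1 - keep)) + b, the gradient with respect to X is a * keep, not a 1 / (1 - p) rescaling:

```python
import numpy as np

# Hypothetical sketch of an alpha-dropout forward/backward pair.
# Assumes the published SELU constants; names are illustrative,
# not the names used in this repo.

def alpha_dropout_forward(X, p, rng):
    alpha, scale = 1.6732632423543772, 1.0507009873554805
    alpha_p = -alpha * scale                         # value a dropped unit takes
    a = ((1 - p) * (1 + p * alpha_p ** 2)) ** -0.5   # affine correction, slope
    b = -a * alpha_p * p                             # affine correction, shift
    keep = (rng.random(X.shape) >= p).astype(X.dtype)
    out = a * (X * keep + alpha_p * (1 - keep)) + b  # mean/variance preserving
    return out, keep, a

def alpha_dropout_backward(dLdy, keep, a):
    # d(out)/dX = a * keep: gradients of dropped units are zeroed,
    # surviving gradients are rescaled by a (not by 1 / (1 - p)).
    return dLdy * keep * a
```

For a standard-normal input, the output mean stays near 0 and the variance near 1, which is exactly what the a and b corrections quoted above are derived to guarantee.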
Hi @BoltzmannZhaung - sorry for the delay. I just did a review pass and left some specific in-line comments. In addition to these, there are two general points I want to emphasize:
I recognize that for simple functions it can be difficult to implement them cleanly in a way that doesn't resemble existing implementations. This is okay, and I don't expect you to be able to write 100% unique code. What I want, however, is that your code represents your efforts, rather than someone else's :)
@ddbourgin |
Dear Dr. Bourgin,
I wanted to submit a complete, working method, but unfortunately it may not work correctly (I have not tested it). So... may I ask for your help to finish it, and earn the honor of being a contributor to this great project (5k+ stars)?
Thank you!