@Haulyn5: The Position-wise Feed-Forward Network in the Transformer consists of two linear transformations with a ReLU activation in between, i.e. FFN(x) = max(0, xW_1 + b_1)W_2 + b_2. In the implementation below, dropout is applied after the activation, before the second linear layer.
Source: https://nlp.seas.harvard.edu/annotated-transformer/#encoder-and-decoder-stacks
```
import torch.nn as nn


class PositionwiseFeedForward(nn.Module):
    "Implements FFN equation."

    def __init__(self, d_model, d_ff, dropout=0.1):
        super(PositionwiseFeedForward, self).__init__()
        self.w_1 = nn.Linear(d_model, d_ff)  # expand: d_model -> d_ff
        self.w_2 = nn.Linear(d_ff, d_model)  # project back: d_ff -> d_model
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # linear -> ReLU -> dropout -> linear
        return self.w_2(self.dropout(self.w_1(x).relu()))
```
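A minimal usage sketch, assuming the sizes from the original paper (d_model=512, d_ff=2048); the FFN maps a (batch, seq_len, d_model) tensor back to the same shape, since nn.Linear operates on the last dimension only:

```
import torch

ffn = PositionwiseFeedForward(d_model=512, d_ff=2048, dropout=0.1)
x = torch.randn(2, 10, 512)  # (batch, seq_len, d_model)
out = ffn(x)                 # applied position-wise: each token independently
print(out.shape)             # torch.Size([2, 10, 512])
```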