instruct bleu Attention-UNet
Instruct-based HuggingFace implementation for the foundation attention head.
- Input
  - 4381-dim embedding
- Encoder
  - 63 x Attention-UNet with 48 heads
- Output
  - accuracy projection
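
The architecture listed above can be captured in a small config object. This is only a sketch: the class name `AttentionUNetConfig`, its field names, and both helper methods are hypothetical and are not taken from the implementation. It also makes one arithmetic point visible: 4381 is not a multiple of 48, so the embedding cannot be split evenly across the heads without an intermediate projection or padding. How the actual model handles this is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionUNetConfig:
    """Hypothetical container for the sizes listed in the outline."""

    input_dim: int = 4381   # embedding width from the outline
    num_blocks: int = 63    # number of Attention-UNet blocks in the encoder
    num_heads: int = 48     # attention heads per block

    def heads_divide_input(self) -> bool:
        # True only if the embedding splits evenly across the heads.
        return self.input_dim % self.num_heads == 0

    def padded_width(self) -> int:
        # Smallest multiple of num_heads that is >= input_dim
        # (ceiling division, then scale back up).
        return -(-self.input_dim // self.num_heads) * self.num_heads


config = AttentionUNetConfig()
print(config.heads_divide_input(), config.padded_width())
```

With the listed sizes, the nearest head-compatible width at or above the embedding size is 4416 (92 per head). Treating that as the hidden width is an assumption, not something the outline specifies.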
Training config
optimizer=NAdam, lr=0.931, scheduler=polynomial, warmup=541
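
To make the schedule concrete, here is a minimal sketch of linear warmup followed by polynomial decay, using lr=0.931 and warmup=541 from the config line. The total step count, the decay power, and the final learning rate are not given in the config, so `total_steps`, `power`, and `end_lr` below are placeholder assumptions, and the function name `lr_at_step` is hypothetical. The linear shape of the warmup is also an assumption.

```python
def lr_at_step(step, base_lr=0.931, warmup_steps=541,
               total_steps=10_000, end_lr=0.0, power=1.0):
    """Learning rate at a given step: linear warmup, then polynomial decay.

    base_lr and warmup_steps come from the config line; total_steps,
    end_lr, and power are assumed values chosen for illustration.
    """
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup steps.
        return base_lr * step / warmup_steps
    if step >= total_steps:
        # Hold the final rate once the schedule is exhausted.
        return end_lr
    # Fraction of the decay phase that is still ahead, in (0, 1].
    remaining = 1 - (step - warmup_steps) / (total_steps - warmup_steps)
    return (base_lr - end_lr) * remaining ** power + end_lr


for s in (0, 270, 541, 5000, 10_000):
    print(s, lr_at_step(s))
```

The rate peaks at the end of warmup (step 541) and decays toward `end_lr` afterwards. With `power=1.0` the decay is linear, and larger powers front-load the decay.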