An easy-to-implement, strong baseline for irregular scene text recognition, built from off-the-shelf neural network components and trained with only word-level annotations.
It consists of a 31-layer ResNet, an LSTM-based encoder-decoder framework, and a 2-dimensional attention module.
*Model
(1) ResNet CNN for feature extraction
(2) 2D-attention based encoder-decoder model
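
To make the idea of attending over a 2D feature map concrete, here is a minimal sketch in plain Python. It is not the paper's attention module: the scoring here is a simple dot product between each feature vector and a query (standing in for the decoder's hidden state), chosen only for illustration. The function name `attend_2d` and the toy inputs are hypothetical.

```python
import math

def attend_2d(feature_map, query):
    """Attend over an H x W grid of D-dim feature vectors.

    feature_map: nested lists of shape H x W x D (e.g. CNN output).
    query: list of length D (assumed stand-in for the decoder state).
    Returns (weights of shape H x W, glimpse of length D).
    """
    # Score every spatial position; dot product is an illustrative choice.
    scores = [[sum(f * q for f, q in zip(vec, query)) for vec in row]
              for row in feature_map]

    # Softmax jointly over all H*W positions, keeping the 2D layout.
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    weights = [[e / total for e in row] for row in exps]

    # Glimpse: attention-weighted sum of the feature vectors.
    glimpse = [0.0] * len(query)
    for row_w, row_v in zip(weights, feature_map):
        for w, vec in zip(row_w, row_v):
            for d, value in enumerate(vec):
                glimpse[d] += w * value
    return weights, glimpse

# Toy 2 x 3 feature map with 2-dim features.
feature_map = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0], [5.0, 0.0]],
]
weights, glimpse = attend_2d(feature_map, query=[1.0, 0.0])
```

In this toy case the largest weight falls on the bottom-right position, since its feature vector aligns most strongly with the query. The point of keeping the weights in an H x W layout is that the decoder can focus on a character anywhere in the image, not only along a horizontal line.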
*Implementation Details
- Torch
- NVIDIA Titan X GPU with 12GB Memory
- cross-entropy loss
- without pre-training
- Adam optimizer (learning rate = 0.001, multiplied by 0.9 every 10,000 iterations until it reaches 0.00001)
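
The learning-rate schedule in the last item can be sketched as a small function. This assumes a staircase decay (the rate changes only at multiples of 10,000 iterations) with a hard floor at 0.00001; the notes do not say whether the decay is stepwise or continuous, and the function and parameter names are hypothetical.

```python
def learning_rate(iteration, base_lr=1e-3, decay=0.9, step=10_000, min_lr=1e-5):
    """Step-decayed learning rate, floored at min_lr.

    Assumption: the rate is multiplied by `decay` once per `step`
    iterations (staircase), and never drops below `min_lr`.
    """
    lr = base_lr * decay ** (iteration // step)
    return max(lr, min_lr)

# Print the rate at a few points during training.
for it in (0, 10_000, 100_000, 440_000):
    print(it, learning_rate(it))
```

Under this reading the floor is reached after 44 decay steps, that is, at iteration 440,000, because 0.9^43 is still above 0.01 while 0.9^44 is below it.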