[AI] cascade_rcnn_r50_fpn_1x

2020. 2. 25. 19:00

*mmcv

- MMCV is a foundational Python library for computer vision research and supports many research projects in MMLAB, such as MMDetection and MMAction.

*ICLR

- International Conference on Learning Representations

*NMS

- Non-maximum suppression

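- Reduces the amount of computation and also improves mAP.

- In general, NMS for finding image edges compares the current pixel with its neighboring pixels: if it is the maximum it is kept as is, and if it is not (non-maximum) it is suppressed (removed).

- In deep-learning-based object detection, the model usually outputs many bounding boxes plus, for each box, the probability that it contains an object (per-class probabilities). NMS is used to remove the overlapping ones (for example, when several bounding boxes are drawn on a single car).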
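The detection case can be sketched in a few lines of plain Python. This is a minimal greedy NMS written for illustration, not the implementation used by MMDetection; the function names `iou` and `nms` and the sample boxes are made up for the example. Boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thr=0.5):
    """Return indices of the boxes kept by greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


# Two nearly identical boxes on the same object, plus one separate box.
boxes = [(10, 10, 110, 110), (12, 8, 112, 108), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_thr=0.5)
print(kept)
```

The second box overlaps the first with an IoU of about 0.92, so it is suppressed, while the third box does not overlap anything and survives.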

*eval()

- Before running inference, you must call model.eval() to put dropout and batch normalization layers into evaluation mode. If you skip this, inference results will be inconsistent.
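To see why the results become inconsistent, here is a toy stand-in for a dropout layer in plain Python. It is not PyTorch's implementation; the class name `ToyDropout` is made up for this sketch, and it only mimics the two facts that matter here: the layer starts in training mode, and `eval()` turns the random masking off.

```python
import random


class ToyDropout:
    """Toy stand-in for a dropout layer (illustration only, not PyTorch)."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True  # starts in training mode

    def eval(self):
        self.training = False
        return self

    def __call__(self, xs):
        if not self.training:
            return list(xs)  # evaluation mode: inputs pass through unchanged
        scale = 1.0 / (1.0 - self.p)
        # Training mode: zero each value with probability p, rescale the rest.
        return [0.0 if random.random() < self.p else x * scale for x in xs]


layer = ToyDropout(p=0.5)
x = [1.0, 2.0, 3.0, 4.0]
train_outputs = {tuple(layer(x)) for _ in range(20)}  # differs between calls
layer.eval()
eval_outputs = {tuple(layer(x)) for _ in range(20)}   # always the same
print(len(train_outputs), len(eval_outputs))
```

In real PyTorch code the equivalent step is calling `model.eval()` once before inference, usually together with `torch.no_grad()`.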

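*NCCL backend (NVIDIA Collective Communications Library)

- The NCCL backend provides an optimized implementation of collective operations on CUDA tensors.

- If you only use CUDA tensors for collective operations, consider using this backend for best-in-class performance.

- The NCCL backend is included in the pre-built binaries that ship with CUDA support.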

model = dict(
    type='CascadeRCNN',
    num_stages=3,
    pretrained='cascade_rcnn_r50_fpn_1x_20190501-3b6211ab.pth',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_scales=[8],
        anchor_ratios=[1.0, 2.0],
        anchor_strides=[4, 8, 16, 32, 64],
        target_means=[.0, .0, .0, .0],
        target_stds=[1.0, 1.0, 1.0, 1.0],
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
    bbox_roi_extractor=dict(
        type='SingleRoIExtractor',
        roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
        out_channels=256,
        featmap_strides=[4, 8, 16, 32]),
    bbox_head=[
        # Stage 1
        dict(
            type='SharedFCBBoxHead',
            num_fcs=2,
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=81,
            target_means=[0., 0., 0., 0.],
            target_stds=[0.1, 0.1, 0.2, 0.2],
            reg_class_agnostic=True,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
        # Stage 2: smaller target_stds than stage 1
        dict(
            type='SharedFCBBoxHead',
            num_fcs=2,
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=81,
            target_means=[0., 0., 0., 0.],
            target_stds=[0.05, 0.05, 0.1, 0.1],
            reg_class_agnostic=True,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
        # Stage 3: smallest target_stds
        dict(
            type='SharedFCBBoxHead',
            num_fcs=2,
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=81,
            target_means=[0., 0., 0., 0.],
            target_stds=[0.033, 0.033, 0.067, 0.067],
            reg_class_agnostic=True,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))])
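The three heads differ only in target_stds, which shrink from stage to stage. The sketch below shows one common way such means and stds are applied to box regression targets: the ground-truth box is encoded as center offsets and log size ratios relative to the proposal, then normalized. The function name `bbox2delta` and the sample boxes are assumptions for this example, and the exact pixel-offset conventions of the library may differ.

```python
import math


def bbox2delta(proposal, gt, means=(0., 0., 0., 0.), stds=(1., 1., 1., 1.)):
    """Encode gt relative to proposal as normalized (dx, dy, dw, dh)."""
    px, py = (proposal[0] + proposal[2]) / 2, (proposal[1] + proposal[3]) / 2
    pw, ph = proposal[2] - proposal[0], proposal[3] - proposal[1]
    gx, gy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    raw = ((gx - px) / pw, (gy - py) / ph,
           math.log(gw / pw), math.log(gh / ph))
    # Normalize so that the regression targets have a comparable scale.
    return tuple((d - m) / s for d, m, s in zip(raw, means, stds))


proposal = (100., 100., 200., 200.)
gt = (105., 100., 205., 200.)  # same size, shifted right by 5 px

# target_stds of the three stages, copied from the config above.
stage_stds = [(0.1, 0.1, 0.2, 0.2),
              (0.05, 0.05, 0.1, 0.1),
              (0.033, 0.033, 0.067, 0.067)]
deltas = [bbox2delta(proposal, gt, stds=s) for s in stage_stds]
for d in deltas:
    print(d)
```

The same 5 px shift produces a larger normalized target at each stage. That fits the idea that later stages receive better proposals and only need to learn small corrections.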

train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=0,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=2000,
        max_num=2000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=[
        # Stage 1: IoU threshold 0.5
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False),
        # Stage 2: IoU threshold 0.6
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.6,
                neg_iou_thr=0.6,
                min_pos_iou=0.6,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False),
        # Stage 3: IoU threshold 0.7
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.7,
                min_pos_iou=0.7,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False)],
    stage_loss_weights=[1, 0.5, 0.25])
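The rising IoU thresholds (0.5, 0.6, 0.7) are the core of the cascade: each stage is stricter about what counts as a positive sample. Below is a simplified sketch of threshold-based labeling and of how stage_loss_weights could combine per-stage losses. The function `assign` is a made-up helper that ignores min_pos_iou and the other assigner options, and the IoU and loss values are invented for illustration.

```python
def assign(max_iou, pos_iou_thr, neg_iou_thr):
    """Label one proposal from its best IoU with any ground-truth box."""
    if max_iou >= pos_iou_thr:
        return 'pos'
    if max_iou < neg_iou_thr:
        return 'neg'
    return 'ignore'  # falls between the two thresholds


stage_thrs = [0.5, 0.6, 0.7]               # pos/neg thresholds per stage
proposal_ious = [0.45, 0.55, 0.65, 0.75]   # hypothetical best IoUs
labels = [[assign(v, t, t) for v in proposal_ious] for t in stage_thrs]
for t, row in zip(stage_thrs, labels):
    print(t, row)

# The RPN uses two different thresholds, so some anchors are ignored.
rpn_label = assign(0.45, pos_iou_thr=0.7, neg_iou_thr=0.3)

# Weighted sum of hypothetical per-stage losses.
stage_loss_weights = [1, 0.5, 0.25]
stage_losses = [1.0, 1.0, 1.0]
total_loss = sum(w * l for w, l in zip(stage_loss_weights, stage_losses))
print(rpn_label, total_loss)
```

A proposal with IoU 0.55 is positive for stage 1 but negative for stages 2 and 3, so fewer and higher-quality positives remain as the cascade progresses.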

test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05,
        nms=dict(type='nms', iou_thr=0.5),
        max_per_img=100))

dataset_type = 'LabelmeCocoDataset'
data_root = ''
img_norm_cfg = dict(
    mean=[18.005, 17.888, 17.764],
    std=[44.024, 43.750, 43.470],
    to_rgb=True)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=False),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=False),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
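To get a feel for what Resize with keep_ratio=True and img_scale=(1333, 800) followed by Pad with size_divisor=32 does to an image, here is a small size calculation in plain Python. The helper names `rescale_size` and `pad_size` and the 1920x1080 input are assumptions for this sketch; it follows the usual rule (long edge at most 1333, short edge at most 800, aspect ratio kept) rather than quoting library code.

```python
import math


def rescale_size(w, h, img_scale=(1333, 800)):
    """Largest size that fits img_scale while keeping the aspect ratio."""
    long_edge, short_edge = max(img_scale), min(img_scale)
    factor = min(long_edge / max(w, h), short_edge / min(w, h))
    return int(w * factor + 0.5), int(h * factor + 0.5)


def pad_size(w, h, divisor=32):
    """Round each side up to the next multiple of divisor."""
    return math.ceil(w / divisor) * divisor, math.ceil(h / divisor) * divisor


resized = rescale_size(1920, 1080)   # hypothetical input image size
padded = pad_size(*resized)
print(resized, padded)
```

For a 1920x1080 image the long edge is the limiting side, so the image becomes 1333x750 and is then padded to 1344x768. Sides that are multiples of 32 divide evenly by every stride up to 32 listed in featmap_strides.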

data = dict(
    imgs_per_gpu=6,
    workers_per_gpu=6,
    train=dict(
        type=dataset_type,
        ann_file='',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file='',
        pipeline=test_pipeline))

optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0005)

optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))

lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    step=[8, 11])
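The schedule described by lr_config can be sketched as a single function. Two things here are assumptions not stated in the config: the decay factor `gamma=0.1` at each step, and the exact form of the linear warmup (a ramp from base_lr * warmup_ratio up to base_lr over warmup_iters iterations). The function name `lr_at` is made up for the example.

```python
def lr_at(epoch, iteration, base_lr=0.02, steps=(8, 11), gamma=0.1,
          warmup_iters=500, warmup_ratio=1.0 / 3):
    """Learning rate for a 0-based epoch and a global iteration count."""
    # Step policy: multiply by gamma (assumed 0.1) once per step reached.
    passed = sum(1 for s in steps if epoch >= s)
    regular_lr = base_lr * gamma ** passed
    # Linear warmup: ramp from warmup_ratio * lr to the full lr.
    if iteration < warmup_iters:
        k = (1 - iteration / warmup_iters) * (1 - warmup_ratio)
        return regular_lr * (1 - k)
    return regular_lr


print(lr_at(0, 0))        # start of warmup
print(lr_at(0, 250))      # halfway through warmup
print(lr_at(0, 500))      # warmup finished
print(lr_at(8, 10000))    # after the first step
print(lr_at(11, 20000))   # after the second step
```

Under these assumptions the rate starts at about 0.0067, reaches 0.02 after 500 iterations, and drops to 0.002 and then 0.0002 at epochs 8 and 11. With total_epochs = 30, the rate would stay at its lowest value from epoch 11 onward.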

checkpoint_config = dict(interval=1)

log_config = dict(
    interval=10,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')])

total_epochs = 30

dist_params = dict(backend='nccl')

log_level = 'INFO'

work_dir = ''

load_from = None

resume_from = None

workflow = [('train', 1)]
