*mmcv

- MMCV is a foundational Python library for computer vision research. It supports many research projects in MMLAB, such as MMDetection and MMAction.

 

*ICLR

- International Conference on Learning Representations

 

*NMS

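- Non-maximum suppression

- It reduces the amount of computation and also raises mAP.

- In general, NMS for finding edges in an image compares the current pixel with its neighboring pixels: if it is the maximum it is kept as is, and otherwise (non-maximum) it is suppressed (removed).

- In deep-learning-based object detection, the output is usually a set of bounding boxes plus, for each box, the probability that it contains an object (per-class probabilities). NMS is used to remove the overlapping ones, for example when several bounding boxes are drawn on a single car.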
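
The box-filtering idea above can be sketched as a greedy loop. This is a minimal NumPy sketch, not the implementation used by any particular detection library; the function names `iou_one_to_many` and `nms`, the `(x1, y1, x2, y2)` box format, and the sample boxes are assumptions made for illustration.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS: keep the best box, drop the boxes that overlap it too much."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_thr]  # survivors go to the next round
    return keep


boxes = np.array([
    [10, 10, 110, 110],    # best box on the car
    [12, 14, 112, 108],    # near-duplicate of the first box
    [200, 200, 260, 260],  # a different object
], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores, iou_thr=0.5)  # indices of the boxes that survive
```

Here the second box overlaps the first with an IoU of about 0.9, so it is suppressed, while the third box does not overlap anything and is kept.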

 

*eval()

- Before running inference, you must call model.eval() to put dropout and batch normalization layers into evaluation mode. Otherwise the inference results will be inconsistent.
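
To see why the output is inconsistent without this call, here is a toy sketch of a dropout layer with a train/eval switch. `ToyDropout` is a hypothetical pure-Python stand-in written for this note, not PyTorch's implementation; in PyTorch, `model.eval()` plays the same role by setting `training = False` on the model and its submodules.

```python
import random


class ToyDropout:
    """Toy inverted dropout with a train/eval switch (illustration only)."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True  # like nn.Module, start in training mode

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, xs):
        if not self.training:
            return list(xs)  # evaluation mode: identity, fully deterministic
        scale = 1.0 / (1.0 - self.p)
        # training mode: zero each value with probability p, rescale the rest
        return [0.0 if random.random() < self.p else x * scale for x in xs]


random.seed(0)  # fixed seed so the sketch is reproducible
layer = ToyDropout(p=0.5)
x = [1.0, 2.0, 3.0, 4.0]

train_outputs = {tuple(layer(x)) for _ in range(50)}  # varies from call to call
layer.eval()
eval_outputs = {tuple(layer(x)) for _ in range(50)}   # always the same
```

In training mode the same input produces many different outputs; after `eval()` every call returns the input unchanged.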

 

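*NCCL backend (NVIDIA Collective Communications Library)

- The NCCL backend provides an optimized implementation of collective operations on CUDA tensors.

- If you only use CUDA tensors for your collective operations, consider using this backend for best-in-class performance.

- The NCCL backend is included in the pre-built binaries with CUDA support.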
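
The rule of thumb above can be written down as a small helper. This is only a sketch: `choose_backend` is a hypothetical function, and falling back to `"gloo"` for the non-CUDA case is an assumption of this example rather than something stated in these notes.

```python
def choose_backend(cuda_tensors_only: bool) -> str:
    """Pick a torch.distributed backend name (hypothetical helper)."""
    # "nccl" and "gloo" are backend names accepted by torch.distributed;
    # using "gloo" when CPU tensors are involved is this sketch's assumption.
    return "nccl" if cuda_tensors_only else "gloo"


dist_params = dict(backend=choose_backend(cuda_tensors_only=True))
# With PyTorch installed, the name would be passed on roughly as:
#   torch.distributed.init_process_group(backend=dist_params["backend"])
```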

 

*

model = dict(

    type='CascadeRCNN',

    num_stages=3,

    pretrained='cascade_rcnn_r50_fpn_lx_20190501-3b6211ab.pth',

    backbone=dict(

        type='ResNet',

        depth=50,

        num_stages=4,

        out_indices=(0,1,2,3),

        frozen_stages=1,

        style='pytorch'),

    neck=dict(

        type='FPN',

        in_channels=[256, 512, 1024, 2048],

        out_channels=256,

        num_outs=5),

    rpn_head=dict(

        type='RPNHead',

        in_channels=256,

        feat_channels=256,

        anchor_scales=[8],

        anchor_ratios=[1.0, 2.0],

        anchor_strides=[4, 8, 16, 32, 64],

        target_means=[.0, .0, .0, .0],

        target_stds=[1.0, 1.0, 1.0, 1.0],

        loss_cls=dict(

            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),

        loss_bbox=dict(type='SmoothL1Loss', beta=1.0/9.0, loss_weight=1.0)),

    bbox_roi_extractor=dict(

        type='SingleRoIExtractor',

        roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),

        out_channels=256,

        featmap_strides=[4,8,16,32]),

    bbox_head=[

        dict(

            type='SharedFCBBoxHead',

            num_fcs=2,

            in_channels=256,

            fc_out_channels=1024,

            roi_feat_size=7,

            num_classes=81,

            target_means=[0., 0., 0., 0.],

            target_stds=[0.1, 0.1, 0.2, 0.2],

            reg_class_agnostic=True,

            loss_cls=dict(

                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),

            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),

        dict(

            type='SharedFCBBoxHead',

            num_fcs=2,

            in_channels=256,

            fc_out_channels=1024,

            roi_feat_size=7,

            num_classes=81,

            target_means=[0., 0., 0., 0.],

            target_stds=[0.05, 0.05, 0.1, 0.1],

            reg_class_agnostic=True,

            loss_cls=dict(

                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),

            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),

        dict(

            type='SharedFCBBoxHead',

            num_fcs=2,

            in_channels=256,

            fc_out_channels=1024,

            roi_feat_size=7,

            num_classes=81,

            target_means=[0., 0., 0., 0.],

            target_stds=[0.033, 0.033, 0.067, 0.067],

            reg_class_agnostic=True,

            loss_cls=dict(

                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),

            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))])

 

train_cfg = dict(

    rpn=dict(

        assigner=dict(

            type='MaxIoUAssigner',

            pos_iou_thr=0.7,

            neg_iou_thr=0.3,

            min_pos_iou=0.3,

            ignore_iof_thr=-1),

        sampler=dict(

            type='RandomSampler',

            num=256,

            pos_fraction=0.5,

            neg_pos_ub=-1,

            add_gt_as_proposals=False),

        allowed_border=0,

        pos_weight=-1,

        debug=False),

    rpn_proposal=dict(

        nms_across_levels=False,

        nms_pre=2000,

        nms_post=2000,

        max_num=2000,

        nms_thr=0.7,

        min_bbox_size=0),

    rcnn=[

        dict(

            assigner=dict(

                type='MaxIoUAssigner',

                pos_iou_thr=0.5,

                neg_iou_thr=0.5,

                min_pos_iou=0.5,

                ignore_iof_thr=-1),

            sampler=dict(

                type='RandomSampler',

                num=512,

                pos_fraction=0.25,

                neg_pos_ub=-1,

                add_gt_as_proposals=True),

            pos_weight=-1,

            debug=False),

        dict(

            assigner=dict(

                type='MaxIoUAssigner',

                pos_iou_thr=0.6,

                neg_iou_thr=0.6,

                min_pos_iou=0.6,

                ignore_iof_thr=-1),

            sampler=dict(

                type='RandomSampler',

                num=512,

                pos_fraction=0.25,

                neg_pos_ub=-1,

                add_gt_as_proposals=True),

            pos_weight=-1,

            debug=False),

        dict(

            assigner=dict(

                type='MaxIoUAssigner',

                pos_iou_thr=0.7,

                neg_iou_thr=0.7,

                min_pos_iou=0.7,

                ignore_iof_thr=-1),

            sampler=dict(

                type='RandomSampler',

                num=512,

                pos_fraction=0.25,

                neg_pos_ub=-1,

                add_gt_as_proposals=True),

            pos_weight=-1,

            debug=False)],

    stage_loss_weights=[1, 0.5, 0.25])
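
The three `rcnn` entries above raise the IoU thresholds stage by stage (0.5, 0.6, 0.7), so each cascade stage accepts fewer, better-localized proposals as positives. The following is a simplified sketch of that idea, not the actual `MaxIoUAssigner` code: `assign_by_max_iou`, the sample IoU values, and the per-stage loss numbers are all assumptions made for illustration.

```python
import numpy as np


def assign_by_max_iou(max_ious, pos_iou_thr, neg_iou_thr):
    """Label proposals from their best IoU with any ground-truth box.

    Returns 1 for positives, 0 for negatives and -1 for ignored proposals.
    Simplified: min_pos_iou and ignore regions are not handled here.
    """
    max_ious = np.asarray(max_ious, dtype=float)
    labels = np.full(max_ious.shape, -1, dtype=int)
    labels[max_ious < neg_iou_thr] = 0
    labels[max_ious >= pos_iou_thr] = 1
    return labels


# Hypothetical best-IoU values for five proposals.
max_ious = [0.20, 0.55, 0.65, 0.75, 0.90]

# (positive, negative) IoU thresholds per cascade stage, as in the rcnn list.
stage_thresholds = [(0.5, 0.5), (0.6, 0.6), (0.7, 0.7)]
positives_per_stage = [
    int((assign_by_max_iou(max_ious, pos, neg) == 1).sum())
    for pos, neg in stage_thresholds
]

# Weighted sum of per-stage losses, in the spirit of stage_loss_weights.
stage_loss_weights = [1, 0.5, 0.25]
stage_losses = [0.8, 0.6, 0.4]  # made-up numbers for illustration
total_loss = sum(w * l for w, l in zip(stage_loss_weights, stage_losses))
```

With these sample values the number of positives shrinks from 4 to 3 to 2 across the stages. With the RPN thresholds (0.7 and 0.3), proposals whose IoU falls between the two values are ignored rather than labeled.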

 

test_cfg = dict(

    rpn=dict(

        nms_across_levels=False,

        nms_pre=1000,

        nms_post=1000,

        max_num=1000,

        nms_thr=0.7,

        min_bbox_size=0),

    rcnn=dict(

        score_thr=0.05,

        nms=dict(type='nms', iou_thr=0.5),

        max_per_img=100))

 

dataset_type='LabelmeCocoDataset'

data_root=''

img_norm_cfg=dict(

    mean=[18.005, 17.888, 17.764],

    std=[44.024, 43.750, 43.470],

    to_rgb=True)
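
As a rough sketch of what the `Normalize` step does with these values: each channel has its mean subtracted and is divided by its standard deviation, after an optional BGR to RGB swap. `normalize_image` is a hypothetical NumPy helper, and the assumption that the image arrives in BGR channel order (as OpenCV loads it) is mine.

```python
import numpy as np


def normalize_image(img_bgr, mean, std, to_rgb=True):
    """Channel-wise normalization of an H x W x 3 image given in BGR order."""
    img = img_bgr.astype(np.float32)
    if to_rgb:
        img = img[..., ::-1]  # reverse the channel axis: BGR -> RGB
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    return (img - mean) / std


img_norm_cfg = dict(
    mean=[18.005, 17.888, 17.764],
    std=[44.024, 43.750, 43.470],
    to_rgb=True)

# A tiny 1 x 2 image: a black pixel and a pixel equal to the mean (in BGR order).
img = np.array([[[0, 0, 0], [17.764, 17.888, 18.005]]], dtype=np.float32)
out = normalize_image(img, **img_norm_cfg)
```

The pixel equal to the mean maps to zero in every channel, and the black pixel maps to `-mean / std`.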

 

train_pipeline = [

    dict(type='LoadImageFromFile'),

    dict(type='LoadAnnotations', with_bbox=True, with_mask=False),

    dict(type='Resize', img_scale=(1333,800), keep_ratio=True),

    dict(type='RandomFlip', flip_ratio=0),

    dict(type='Normalize', **img_norm_cfg),

    dict(type='Pad', size_divisor=32),

    dict(type='DefaultFormatBundle'),

    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),]
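
The effect of `Resize` with `img_scale=(1333, 800)`, `keep_ratio=True` followed by `Pad` with `size_divisor=32` can be worked out by hand. The sketch below assumes the usual interpretation (long side at most 1333, short side at most 800, aspect ratio preserved, then each side rounded up to a multiple of 32); `rescale_size` and `padded_size` here are hypothetical helpers, not library calls.

```python
import math


def rescale_size(width, height, img_scale=(1333, 800)):
    """New (width, height) that fits inside img_scale, keeping the aspect ratio."""
    max_long, max_short = max(img_scale), min(img_scale)
    scale = min(max_long / max(width, height), max_short / min(width, height))
    return int(width * scale + 0.5), int(height * scale + 0.5)


def padded_size(width, height, size_divisor=32):
    """Round each side up to the next multiple of size_divisor."""
    def round_up(value):
        return math.ceil(value / size_divisor) * size_divisor
    return round_up(width), round_up(height)


resized = rescale_size(1920, 1080)  # a Full HD frame -> (1333, 750)
padded = padded_size(*resized)      # -> (1344, 768)
```

A 1920 x 1080 image is limited by its long side, so it becomes 1333 x 750 and is then padded to 1344 x 768.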

 

val_pipeline = [

    dict(type='LoadImageFromFile'),

    dict(type='LoadAnnotations', with_bbox=True, with_mask=False),

    dict(type='Resize', img_scale=(1333,800), keep_ratio=True),

    dict(type='RandomFlip', flip_ratio=0),

    dict(type='Normalize', **img_norm_cfg),

    dict(type='Pad', size_divisor=32),

    dict(type='DefaultFormatBundle'),

    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),]

 

test_pipeline = [

    dict(type='LoadImageFromFile'),

    dict(

        type='MultiScaleFlipAug',

        img_scale=(1333, 800),

        flip=False,

        transforms=[

            dict(type='Resize', keep_ratio=True),

            dict(type='RandomFlip'),

            dict(type='Normalize', **img_norm_cfg),

            dict(type='Pad', size_divisor=32),

            dict(type='ImageToTensor', keys=['img']),

            dict(type='Collect', keys=['img']) ]) ]

 

data = dict(

    imgs_per_gpu=6,

    workers_per_gpu=6,

    train=dict(

        type=dataset_type,

        ann_file='',

        pipeline=train_pipeline),

    val=dict(

        type=dataset_type,

        ann_file='',

        pipeline=test_pipeline))

 

optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0005)

optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
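
`grad_clip=dict(max_norm=35, norm_type=2)` asks for gradient clipping by total L2 norm. Below is a minimal NumPy sketch of that operation, modeled on how norm-based clipping is commonly done; `clip_grad_norm` is a hypothetical helper and the gradient values are made up.

```python
import numpy as np


def clip_grad_norm(grads, max_norm=35.0, norm_type=2):
    """Scale gradients in place so that their combined norm is at most max_norm."""
    total_norm = sum(np.sum(np.abs(g) ** norm_type) for g in grads) ** (1.0 / norm_type)
    clip_coef = max_norm / (total_norm + 1e-6)
    if clip_coef < 1:
        for g in grads:
            g *= clip_coef  # same factor for every tensor keeps the direction
    return float(total_norm)


grads = [np.array([30.0, 40.0]), np.array([0.0])]  # combined L2 norm is 50
norm_before = clip_grad_norm(grads, max_norm=35, norm_type=2)
norm_after = float(np.sqrt(sum(np.sum(g ** 2) for g in grads)))
```

Gradients whose combined norm is already below `max_norm` are left untouched; larger ones are scaled down to it.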

 

lr_config = dict(

    policy='step',

    warmup='linear',

    warmup_iters=500,

    warmup_ratio=1.0/3,

    step=[8, 11])
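
The schedule described by this `lr_config` can be sketched as a plain function: the rate is decayed at epochs 8 and 11, and during the first 500 iterations it ramps up linearly from one third of its value. The decay factor `gamma=0.1` is an assumption (it is not set in the config), `base_lr=0.02` is taken from the optimizer settings, and `learning_rate` is a hypothetical helper.

```python
def learning_rate(epoch, cur_iter, base_lr=0.02, step=(8, 11), gamma=0.1,
                  warmup_iters=500, warmup_ratio=1.0 / 3):
    """Step decay per epoch with linear warmup over the first iterations.

    gamma=0.1 is an assumed decay factor; it is not given in the config.
    """
    # Step policy: multiply by gamma once for every milestone already reached.
    passed = sum(1 for s in step if epoch >= s)
    regular_lr = base_lr * gamma ** passed
    if cur_iter >= warmup_iters:
        return regular_lr
    # Linear warmup: ramp from warmup_ratio * lr up to lr.
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return regular_lr * (1 - k)


lr_start = learning_rate(epoch=0, cur_iter=0)           # base_lr / 3
lr_mid_warmup = learning_rate(epoch=0, cur_iter=250)    # two thirds of base_lr
lr_after_warmup = learning_rate(epoch=0, cur_iter=500)  # base_lr
lr_epoch_8 = learning_rate(epoch=8, cur_iter=100000)    # base_lr * 0.1
lr_epoch_11 = learning_rate(epoch=11, cur_iter=100000)  # base_lr * 0.01
```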

 

checkpoint_config = dict(interval=1)

 

log_config = dict(

    interval=10,

    hooks=[

        dict(type='TextLoggerHook'),

        dict(type='TensorboardLoggerHook')])

 

total_epochs = 30

dist_params = dict(backend='nccl')

log_level = 'INFO'

work_dir = ''

load_from = None

resume_from = None

workflow = [('train', 1)]

 

 
