Recommended Computer Vision Projects from Microsoft Research Asia


Microsoft Research Asia has produced many cutting-edge computer vision technologies, along with numerous well-regarded, high-quality open-source projects. For anyone who loves CV, the recommendations below are not to be missed.

Additionally, at the end of the article, 52CV is hosting an event titled “Win Red Packets by Recommending CV Projects”. Everyone is welcome to participate.

Author: Microsoft Research Asia

Link: https://www.zhihu.com/question/320330671/answer/738260594

Source: Zhihu. Reprinted with the author's authorization; further reproduction is prohibited.

We invited Dr. Wang Jingdong, a senior researcher in the Visual Computing Group at Microsoft Research Asia, and Dr. Yuan Yuhui, a researcher in the same group, to recommend projects from Microsoft Research Asia. The projects are grouped into the following fields: Object Detection, Semantic and Instance Segmentation, Human Pose Estimation, Face Alignment, Efficient and Lightweight Convolutional Neural Network Structure Design, Person Re-identification, Video Object Detection, Object Tracking, and Nearest Neighbor Search.

Object Detection

1. HRNet-Object-Detection

A novel backbone network structure proposed by the Visual Computing Group at Microsoft Research Asia, capable of learning high-resolution representations and improving the spatial precision of object detection, especially for small objects. It supports multi-scale training with sync-bn (synchronized batch normalization).


Code:

https://github.com/HRNet/HRNet-Object-Detection

https://github.com/HRNet/HRNet-MaskRCNN-Benchmark

https://github.com/HRNet/HRNet-FCOS

Paper:

https://arxiv.org/pdf/1904.04514.pdf

2. Deformable Convolutional Networks

A convolutional neural network proposed by the Visual Computing Group at Microsoft Research Asia, capable of modeling geometric deformations.

Code:

https://github.com/msracver/Deformable-ConvNets

Paper:

https://arxiv.org/abs/1703.06211

https://arxiv.org/abs/1811.11168

3. Relation Networks

A method proposed by the Visual Computing Group at Microsoft Research Asia that improves object detector performance by utilizing relationships between objects.

Code:

https://github.com/msracver/Relation-Networks-for-Object-Detection

Paper:

https://arxiv.org/pdf/1711.11575.pdf

Semantic and Instance Segmentation

1. HRNet-Semantic-Segmentation

A novel backbone network structure proposed by the Visual Computing Group at Microsoft Research Asia, capable of learning high-resolution representations, effectively improving semantic segmentation performance.

Code:

https://github.com/HRNet/HRNet-Semantic-Segmentation

https://github.com/HRNet/HRNet-MaskRCNN-Benchmark

Paper:

https://arxiv.org/pdf/1904.04514.pdf

2. Fully Convolutional Instance-Aware Semantic Segmentation

An end-to-end instance segmentation system proposed by the Visual Computing Group at Microsoft Research Asia. It is built on fully convolutional networks and won first place in the COCO 2016 segmentation competition.

Code:

https://github.com/msracver/FCIS

Paper:

https://arxiv.org/pdf/1611.07709.pdf

Human Pose Estimation

1. HRNet-Human-Pose-Estimation

Code:

https://github.com/leoxiaobin/deep-high-resolution-net.pytorch

Paper:

https://arxiv.org/pdf/1902.09212.pdf

2. SimplePose

A simple and effective network structure proposed by the Visual Computing Group at Microsoft Research Asia for detecting human keypoints.

Code:

https://github.com/Microsoft/human-pose-estimation.pytorch

Paper:

https://arxiv.org/abs/1804.06208

3. Integral Human Pose Regression

A method proposed by the Visual Computing Group at Microsoft Research Asia that uses an integral operation over heatmaps to address the non-differentiable post-processing and quantization error that arise in 3D human pose estimation tasks.

Code:

https://github.com/JimmySuen/integral-human-pose

Paper:

https://arxiv.org/abs/1711.08229
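
To make the "integral operation" idea concrete, here is a minimal sketch in plain Python. It assumes the heatmap is normalized with a softmax and the joint location is read out as the expected coordinate; the function names and the toy heatmap are illustrative only and are not taken from the project's code. A hard argmax is shown alongside for comparison, since it can only return integer grid positions.

```python
import math

def soft_argmax_2d(heatmap):
    """Expected (x, y) location under a softmax over all heatmap cells."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(max(row) for row in heatmap)
    probs = [[math.exp(v - peak) for v in row] for row in heatmap]
    total = sum(sum(row) for row in probs)
    # The "integral": a probability-weighted sum of the grid coordinates.
    x = sum(p * col for row in probs for col, p in enumerate(row)) / total
    y = sum(p * r for r, row in enumerate(probs) for p in row) / total
    return x, y

def hard_argmax_2d(heatmap):
    """Integer (x, y) of the largest cell; ties go to the larger (row, col)."""
    _, r, c = max((v, r, c) for r, row in enumerate(heatmap)
                  for c, v in enumerate(row))
    return c, r

# Toy heatmap whose true peak lies halfway between columns 1 and 2.
toy_heatmap = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 5.0, 5.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(soft_argmax_2d(toy_heatmap))  # sub-pixel estimate
print(hard_argmax_2d(toy_heatmap))  # snapped to the grid
```

Because the soft readout is a smooth function of the heatmap values, gradients can flow through it during training, which a hard argmax does not allow.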

Face Alignment

1. HRNet-Facial-Landmark-Detection

Code:

https://github.com/HRNet/HRNet-Facial-Landmark-Detection

Paper:

https://arxiv.org/pdf/1904.04514.pdf

Efficient and Lightweight Convolutional Neural Network Structure Design

1. HRNet-Classification

A novel backbone network structure proposed by the Visual Computing Group at Microsoft Research Asia, capable of learning multi-resolution representations and combining them to perform image recognition.

Code:

https://github.com/HRNet/HRNet-Image-Classification

Paper:

https://arxiv.org/pdf/1904.04514.pdf

2. Interleaved Group Convolutions

A lightweight network structure proposed by the Visual Computing Group at Microsoft Research Asia, reported to achieve better results than Google’s MobileNetV2 in classification and detection tasks.

Code:

https://github.com/homles11/IGCV3

Paper:

https://arxiv.org/pdf/1707.02725.pdf

https://arxiv.org/pdf/1804.06202.pdf

https://arxiv.org/pdf/1806.00178.pdf

Person Re-identification

1. Deeply-Learned Part-Aligned Representations

A method proposed by the Visual Computing Group at Microsoft Research Asia that utilizes human body part information to extract person representations.

Code:

https://github.com/zlmzju/part_reid

Paper:

http://openaccess.thecvf.com/content_ICCV_2017/papers/Zhao_Deeply-Learned_Part-Aligned_Representations_ICCV_2017_paper.pdf

2. Part-Aligned Bilinear Representations

A method proposed by the Visual Computing Group at Microsoft Research Asia that uses bilinear pooling to combine body poses to extract person representations.

Code:

https://github.com/yuminsuh/part_bilinear_reid

Paper:

http://openaccess.thecvf.com/content_ECCV_2018/papers/Yumin_Suh_Part-Aligned_Bilinear_Representations_ECCV_2018_paper.pdf
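
As a rough illustration of bilinear pooling, the sketch below combines an appearance feature and a body-part feature at every spatial location by an outer product and then averages over locations. Averaging (rather than summing), the absence of any normalization step, and all names here are assumptions made for illustration; this is not the project's implementation.

```python
def bilinear_pool(appearance, part):
    """Average of per-location outer products, flattened to one vector.

    appearance: list of locations, each a list of C_a floats.
    part:       list of the same locations, each a list of C_p floats.
    Returns a list of C_a * C_p floats.
    """
    n = len(appearance)
    ca, cp = len(appearance[0]), len(part[0])
    pooled = [0.0] * (ca * cp)
    for a, p in zip(appearance, part):
        for i in range(ca):
            for j in range(cp):
                # Entry (i, j) of the outer product a p^T at this location.
                pooled[i * cp + j] += a[i] * p[j] / n
    return pooled

# Two locations, two appearance channels, two part channels.
appearance_map = [[1.0, 2.0], [3.0, 4.0]]
part_map = [[1.0, 0.0], [0.0, 1.0]]
print(bilinear_pool(appearance_map, part_map))
```

In this toy input each location activates exactly one part channel, so each appearance vector is routed into the slot of its own part, which is the intuition behind part-aligned matching.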

Video Object Detection

1. Deep Feature Flow

A network structure proposed by the Visual Computing Group at Microsoft Research Asia for video understanding, which uses optical flow between video frames to propagate key-frame features, and hence predictions, to adjacent frames.

Code:

https://github.com/msracver/Deep-Feature-Flow

Paper:

https://arxiv.org/abs/1611.07715
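
The propagation step can be sketched as warping a key-frame feature map with a flow field using bilinear interpolation. The sketch below uses a single-channel feature map in plain Python; the flow convention (each current-frame cell stores the offset to its source position in the key frame), the zero padding outside the map, and all names are assumptions for illustration rather than the project's code.

```python
import math

def bilinear_sample(feat, x, y):
    """Sample feat[row][col] at a real-valued (x, y); outside cells count as 0."""
    h, w = len(feat), len(feat[0])
    x0, y0 = math.floor(x), math.floor(y)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Bilinear weight falls off linearly with distance to the cell.
                weight = (1 - abs(x - xx)) * (1 - abs(y - yy))
                value += weight * feat[yy][xx]
    return value

def warp_features(key_feat, flow):
    """Build the current-frame feature map by sampling the key-frame map.

    flow[row][col] = (dx, dy): offset from this current-frame cell to the
    position in the key frame where its feature should be read (assumed).
    """
    return [
        [bilinear_sample(key_feat, c + dx, r + dy)
         for c, (dx, dy) in enumerate(row)]
        for r, row in enumerate(flow)
    ]

key_feat = [[1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0]]
# Every cell reads from one column to its right in the key frame.
shift_right = [[(1.0, 0.0)] * 3 for _ in range(2)]
print(warp_features(key_feat, shift_right))
```

The point of the design is cost: running the warp is much cheaper than running the full feature network on every frame, so the expensive network only needs to run on sparse key frames.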

2. Flow-Guided Feature Aggregation

A framework proposed by the Visual Computing Group at Microsoft Research Asia to solve object detection problems in videos, utilizing optical flow to help combine representations of adjacent frames.

Code:

https://github.com/msracver/Flow-Guided-Feature-Aggregation

Paper:

https://arxiv.org/pdf/1703.10025.pdf

Object Tracking

1. Deeper and Wider Siamese Networks

A deeper and wider Siamese network proposed by the Multimedia Search and Mining Group at Microsoft Research Asia to solve the object tracking problem.

Code:

https://github.com/researchmm/SiamDW

Paper:

https://arxiv.org/abs/1901.01660

Nearest Neighbor Search

1. SPTAG

An indexing and search system jointly released by the systems group at Microsoft Research Asia and the Bing team. It can index and search billions of data items and is already used in Microsoft Bing products.

Code:

https://github.com/Microsoft/SPTAG

Paper:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.862.7975&rep=rep1&type=pdf

http://pages.ucsd.edu/~ztu/publication/cvpr12_knnG.pdf

https://ieeexplore.ieee.org/iel7/34/4359286/06549106.pdf

2. Composite Quantization

An efficient compact coding (hashing) algorithm proposed by the Visual Computing Group at Microsoft Research Asia.

Code:

https://github.com/hellozting/CompositeQuantization

Paper:

http://proceedings.mlr.press/v32/zhangd14.pdf
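
The core idea of this kind of compact coding is to approximate a vector by the sum of one codeword chosen from each of several dictionaries, so that only the chosen indices need to be stored. The sketch below illustrates that idea with a simple coordinate-descent encoder over fixed toy dictionaries. The encoder is a stand-in chosen for brevity, the dictionaries are made up, and dictionary learning and any constraints used in the paper are omitted, so treat it as an assumption-laden illustration rather than the published algorithm.

```python
def reconstruct(codebooks, codes):
    """Sum the selected codeword from each dictionary."""
    dim = len(codebooks[0][0])
    out = [0.0] * dim
    for book, k in zip(codebooks, codes):
        for d in range(dim):
            out[d] += book[k][d]
    return out

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def encode(x, codebooks, sweeps=3):
    """Pick one index per dictionary by coordinate descent (illustrative)."""
    codes = [0] * len(codebooks)
    for _ in range(sweeps):
        for m, book in enumerate(codebooks):
            # Re-choose dictionary m's codeword with all other choices fixed.
            def error(k, m=m):
                trial = codes[:m] + [k] + codes[m + 1:]
                return sq_dist(x, reconstruct(codebooks, trial))
            codes[m] = min(range(len(book)), key=error)
    return codes

# Two hypothetical dictionaries with two 2-D codewords each.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
toy_codes = encode([0.9, 2.1], toy_codebooks)
print(toy_codes, reconstruct(toy_codebooks, toy_codes))
```

With M dictionaries of K codewords each, a vector is stored as M small integers, yet the sums can represent up to K to the power M distinct points.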

CV Project Recommendation Activity

Everyone is welcome to leave a message at the end of the article recommending high-quality open-source CV projects. The deadline is 12 noon on August 19. The recommenders of the most-liked open-source projects will each receive a 50 yuan red packet from CV Jun.

Messages must include a project introduction and the open-source address, for example:

Project Introduction:

Face Analysis Project on MXNet

Address:

https://github.com/deepinsight/insightface

P.S. All 52CV followers may submit recommendations, but to prevent vote manipulation, prizes are limited to followers who followed before August 16. If a top-liked user does not meet this criterion, the prize passes to the next eligible user. Thank you for your understanding.

CV Subfield Communication Group

52CV has established multiple specialized CV discussion groups, including object tracking, object detection, semantic segmentation, pose estimation, face recognition and detection, medical image processing, super-resolution, neural architecture search, GAN, reinforcement learning, and more. Scan the code to add CV Jun, who will invite you into the group.

(Please be sure to indicate the relevant direction, for example: Object Detection.)

If you prefer to communicate on QQ, you can join the official 52CV QQ group: 805388940.

(I won’t be online all the time, so please forgive me if I can’t verify you right away.)
