I am a PhD candidate at the Robotics Institute at Carnegie Mellon University, working with Simon Lucey.

I received my Bachelor's degree with honors from the Department of Automation, Tsinghua University.

My interests

  • 3D Reconstruction: structure from motion, non-rigid structure from motion, single-image 3D reconstruction, etc.
  • Signal Processing: sparse coding, dictionary learning, etc.
  • Deep Learning

Email:
chenk@cs.cmu.edu

Tel:
(412)535-2507

Address:
5000 Forbes Avenue
Pittsburgh, PA, USA 15213

Selected Projects

Non-Rigid Structure from Motion

We propose that non-rigid 3D structures are well modeled as sparse combinations of elements from a dictionary, a more expressive and general assumption. Based on this assumption, we present two approaches that solve the non-rigid structure from motion problem without the aid of additional priors (e.g. temporal ordering, rigid substructures).

  • C. Kong and S. Lucey. Prior-less compressible structure from motion. In Computer Vision and Pattern Recognition (CVPR), 2016.
  • C. Kong, R. Zhu, H. Kiani, and S. Lucey. Structure from category: a generic and prior-less approach. In International Conference on 3D Vision (3DV), 2016.
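
As a rough illustration of the sparse-dictionary assumption (a toy sketch, not the algorithms from the papers above), the snippet below builds one 3D shape as a combination of a few dictionary atoms and projects it with an orthographic camera. The sizes `P` and `K`, the random dictionary `D`, and the choice of three active atoms are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

P, K = 20, 50                        # points per shape, dictionary size (hypothetical)
D = rng.standard_normal((K, 3, P))   # toy dictionary: K basis shapes, each 3 x P

# Sparse code: only a few atoms are active for this shape.
c = np.zeros(K)
active = rng.choice(K, size=3, replace=False)
c[active] = rng.standard_normal(3)

# Shape as a sparse linear combination of dictionary atoms.
S = np.tensordot(c, D, axes=1)       # contracts over K, result is 3 x P

# Orthographic camera: two orthonormal rows taken from a random orthogonal matrix.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R = Q[:2]                            # 2 x 3
W = R @ S                            # 2 x P projected 2D landmarks

print(S.shape, W.shape, np.count_nonzero(c))
```

Non-rigid structure from motion is the inverse problem: given only projections like `W` across many images, recover the cameras and the 3D shapes.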

Single Image 3D Reconstruction

We investigate the problem of estimating the dense 3D shape of an object, given a set of 2D landmarks and a silhouette in a single image. An obvious prior to employ in such a problem is the myriad of dense CAD models available online. However, each model is manually and independently designed, so the models do not necessarily share the same number of vertices or the same mesh structure. We propose a novel graph embedding based on local dense correspondence to allow for sparse linear combinations of CAD models.

  • C. Kong, C. Lin, and S. Lucey. Using Locally Corresponding CAD Models for Dense 3D Reconstructions from a Single Image. In Computer Vision and Pattern Recognition (CVPR), 2017.
  • J. Pontes, C. Kong, A. Eriksson, C. Fookes, S. Sridharan, and S. Lucey. Compact model representation for 3D reconstruction. In International Conference on 3D Vision (3DV), 2017.

Analysis of Deep Learning

We analyze convolutional neural networks (CNNs) via convolutional sparse coding. We propose to simplify the CNN architecture by replacing non-unity strides with unity strides. We also employ a novel alternation strategy for CNN training that leads to substantially faster convergence, has nice theoretical properties, and achieves state-of-the-art results on large-scale datasets (e.g. ImageNet) as well as other standard benchmarks.

  • C. Kong and S. Lucey. Take it in your stride: Do we need striding in CNNs? Under Review.
  • C. Huang, C. Kong, and S. Lucey. CNNs are Globally Optimal Given Multi-Layer Support. Under Review.
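
For readers unfamiliar with striding, a stride-s convolution produces the same values as a unity-stride convolution followed by keeping every s-th output. The sketch below checks this identity on a 1D signal. It only illustrates the relation between the two, not the method from the papers above, and the helper name `conv1d_valid` and the signal and filter sizes are assumptions made for the example.

```python
import numpy as np

def conv1d_valid(x, w, stride=1):
    """Cross-correlate x with filter w over fully overlapping windows, stepping by `stride`."""
    n = len(x) - len(w) + 1          # number of unity-stride output positions
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(0, n, stride)])

rng = np.random.default_rng(0)
x = rng.standard_normal(32)          # toy input signal
w = rng.standard_normal(5)           # toy filter

y_strided = conv1d_valid(x, w, stride=2)
y_unit = conv1d_valid(x, w, stride=1)

# The stride-2 output equals the unity-stride output subsampled by 2.
same = np.allclose(y_strided, y_unit[::2])
print(same)
```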

Papers

Take it in your stride: Do we need striding in CNNs?

Chen Kong and Simon Lucey

arXiv Code (coming soon)

CNNs are Globally Optimal given Multi-Layer Support

Chen Huang, Chen Kong, and Simon Lucey

arXiv Code (coming soon)

Image2Mesh: A Learning Framework for Single Image 3D Reconstruction

Jhony Kaesemodel Pontes, Chen Kong, Sridha Sridharan, Anders Eriksson, Simon Lucey, and Clinton Fookes

arXiv (coming soon) Code (coming soon)

Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction

Chen-Hsuan Lin, Chen Kong, and Simon Lucey

AAAI Conference on Artificial Intelligence (AAAI), 2018

Paper Poster arXiv Code (coming soon)

Compact model representation for 3D reconstruction

Jhony K. Pontes, Chen Kong, Anders Eriksson, Clinton Fookes, Sridha Sridharan, and Simon Lucey

International Conference on 3D Vision (3DV), 2017

Page Paper arXiv Video BibTeX

Using Locally Corresponding CAD Models for Dense 3D Reconstructions from a Single Image

Chen Kong, Chen-Hsuan Lin, and Simon Lucey

Computer Vision and Pattern Recognition (CVPR), 2017

Paper Poster Code (coming soon)

Structure from category: a generic and prior-less approach

Chen Kong, Rui Zhu, Hamed Kiani, and Simon Lucey

International Conference on 3D Vision (3DV), 2016

Paper Poster Code

Prior-less compressible structure from motion

Chen Kong and Simon Lucey

Computer Vision and Pattern Recognition (CVPR), 2016

Paper Poster Code Page

Generating Multi-Sentence Lingual Descriptions of Indoor Scenes

Dahua Lin, Chen Kong, Sanja Fidler, and Raquel Urtasun

British Machine Vision Conference (BMVC), 2015

Paper

What are you talking about? text-to-image coreference

Chen Kong, Dahua Lin, Mohit Bansal, Raquel Urtasun, and Sanja Fidler

Computer Vision and Pattern Recognition (CVPR), 2014

Paper Data Page

Visual semantic search: Retrieving videos via complex textual queries

Dahua Lin, Sanja Fidler, Chen Kong, and Raquel Urtasun

Computer Vision and Pattern Recognition (CVPR), 2014

Paper Supplementary Material