I’m a first-year graduate student in the
School of Computer Science at Carnegie Mellon University. I am advised
by Prof. Garth Gibson.
Before coming to CMU, I received my B.S. in Computer Science from Peking University. I was born in Jingdezhen, a historic city in Jiangxi, China.
Here is my resume.
My current research focuses on improving NameNode scalability in the Hadoop Distributed File System (HDFS) as part of the ShardFS project.
I am interested in building distributed systems and large-scale storage systems.
Graduate courses I’ve taken:
- Fall 2012
  - 15648: Studio in Big Data Systems (I)
    - LogAnalyzer: HDFS log analyzer that collects dynamic file access statistics.
  - 15640: Distributed Systems
    - Tibbler: Twitter-like distributed information dissemination system.
    - Godocs: Google Docs implementation with a Paxos-based key-value store, written in Golang.
  - 15826: Multimedia Databases and Data Mining
    - PIGFLY: Implementation of scalable graph mining algorithms on Apache Pig.
  - 10601: Machine Learning
- Spring 2013
  - 15648: Studio in Big Data Systems (II)
    - ShardFS/IGrow: Online incremental growth of NameNodes in ShardFS, based on Federated HDFS.
  - 15746: Advanced Storage Systems
    - Fsck: An fsck utility for the ext2 file system.
    - CloudFS: Hybrid SSD and cloud storage system with deduplication and caching.
  - 15618: Parallel Computer Architecture and Programming
    - GPUSA: Implementation of a parallel suffix array construction algorithm on the GPU.
  - 15619: Cloud Computing
    - Hands-on experience with Amazon Web Services (AWS): EC2, S3, CloudWatch, Auto Scaling, Load Balancer, Elastic MapReduce, DynamoDB, HBase.
  - 10601: Machine Learning (Grader)
- Fall 2013
  - Capstone project
  - TA for 15619: Cloud Computing
© 2013 Chaomin Yu