go back to the main page

SSS Abstracts
Fall 2012

go back to the main page

A Human-Like Intelligent Agent with Applications in Intelligent Tutoring

Tuesday, September 11^th, 2012 from 12-1 pm in GHC 4405.

Presented by Nan Li, CSD

Building an intelligent agent that simulates human learning of math and science could potentially benefit both cognitive science, by contributing to the understanding of human learning, and artificial intelligence, by advancing the goal of creating human-level intelligence. SimStudent is a learning agent that acquires procedural knowledge by observing a tutor solve sample problems and by getting feedback while actively solving problems on its own. It has been integrated into an existing authoring tool for Cognitive Tutors. End-users can create intelligent tutoring systems by teaching SimStudent rather than by programming.

However, constructing such a learning agent currently requires manual encoding of domain-relevant prior knowledge; in addition to being a poor model of human acquisition of prior knowledge, manual knowledge-encoding is both time-consuming and error-prone. In this work, we propose an efficient algorithm that acquires domain representations (via unsupervised perceptual chunking) and demonstrate its effectiveness in math and science domains. We integrate this algorithm into SimStudent, and show that learning domain representations reduces the requirements for knowledge engineering. Moreover, we propose an approach that automatically discovers student models using the extended SimStudent and demonstrate how the discovered model may be used to improve a tutoring system's instructional strategy.

Object Propositions

Friday, September 14^th, 2012 from 12-1 pm in GHC 4303.

Presented by Ligia Nistor, CSD

The presence of aliasing makes modular verification of object-oriented code difficult. If multiple clients depend on the properties of an object, one client may break a property that others depend on.

We have developed a modular verification approach based on the novel abstraction of object propositions, which combine predicates and information about object aliasing. In our methodology, even if shared data is modified, we know that an object invariant specified by a client holds.

Our permission system allows verification using a mixture of linear and nonlinear reasoning. This allows it to provide more modularity in some cases than competing separation logic approaches, because it can more effectively hide the exact aliasing relationships within a module. We validate our approach in an instance of the composite pattern that illustrates our system's practicality.

Joint work with Jonathan Aldrich, Stephanie Balzer and Hannes Mehnert

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Exact and Approximate Computation of a Histogram of Pairwise Distances between Astronomical Objects

Friday, October 5^th, 2012 from 12-1 pm in GHC 4303.

Presented by Bin Fu, CSD

We compare several alternative approaches to computing correlation functions, which is a cosmological application for analyzing the distribution of matter in the universe. This computation involves counting the pairs of galaxies within a given distance from each other and building a histogram that shows the dependency of the number of pairs on the distance. We propose a novel approximation algorithm, which outperforms each of the existing technique. We also parallelize our algorithm and evaluate its performance.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

A Generalization of SAT and #SAT for Probabilistic Inference and Verification

Friday, October 26^th, 2012 from 12-1 pm in GHC 4303.

Presented by Erik Zawadzki, CSD

In this talk we considering a generalization of both Boolean satisfiability (SAT) and #SAT where a subset of variables are assigned randomly. In these problems, rather than asking "is a formula satisfiable?", we ask "how likely is a formula to be satisfiable?". We call this type of bounded-alternation stochastic satisfiability problem #E-SAT.

This problem is complete for a particular complexity class that we will compare to more traditional counting complexity classes like #P. Problem instances from #E-SAT are well-suited to a class of problems that we call 'robust verfication'. We present several examples of problems from this class including simple circuit verification problems and job shop scheduling problems. Two different solvers for #E-SAT are presented and compared.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Y-ray: Scalable Real-Time Extraction of File Updates from VMs

Tuesday, November 27^th, 2012 from 12-1 pm in GHC 4405.

Presented by Wolfgang Richter, CSD

We describe a new mechanism for cloud computing that enables real-time monitoring of file updates across an entire cloud without guest VM support. Its attributes include low virtual I/O overhead, low latency for emitting log file updates, and a scalable design. This mechanism works by asynchronous inference of file update operations from introspected disk sector writes. System administration tasks that involve monitoring of files can be performed outside VM instances, either locally or remotely. We demonstrate and validate this approach in an experimental prototype.

Joint work with Canturk Isci (IBM Research), Jan Harkes, Benjamin Gilbert, Vasanth Bala (IBM Research), Mahadev Satyanarayanan

Presented in Partial Fulfillment of the Speaking Requirement

SOFTScale: Stealing Opportunistically For Transient Scaling

Friday, November 30^th, 2012 from 12-1 pm in GHC 4303.

Presented by Anshul Gandhi, CSD

Dynamic capacity provisioning is a well studied approach to handling gradual changes in data center load. However, abrupt spikes in load are still problematic in that the work in the system rises very quickly during the setup time needed to turn on additional capacity. Performance can be severely affected even if it takes only 5 seconds to bring additional capacity online.

In this talk, I will introduce SOFTScale, an approach to handling load spikes in multi-tier data centers without having to over-provision resources. SOFTScale works by opportunistically stealing resources from other tiers to alleviate the bottleneck tier, even when the tiers are carefully provisioned at capacity. Via implementation on a 28-server multi-tier testbed, we investigate a range of possible load spikes, and demonstrate that SOFTScale can meet response time goals without using any additional resources.

Joint work with: Timothy Zhu (CMU) Mor Harchol-Balter (CMU) Michael A. Kozuch (Intel)

Presented in Partial Fulfillment of the Speaking Requirement

Scaling Metadata Management in Hadoop Distributed File System

Tuesday, December 4^th, 2012 from 12-1 pm in GHC 4405.

Presented by Lin Xiao, CSD

Hadoop distributed file system (HDFS) is an open source system for handling large datasets following the design of Google file system. Assuming most files are large, HDFS stores all metadata in memory in one metadata server called namenode. However, files are found to be small in reality. As data sets grow, we expect the single namenode to be the bottleneck of the system. We are exploring different solutions to scale metadata in HDFS.

HDFS federation uses multiple independent namenodes to scale namespace horizontally. Each namenode only knows and manages its own namespace without coordination with others, which leaves the clients to decide how to balanced metadata workload among these namenodes without much performance degradation. We will present different client-side usage of the federated namenodes and the trade-offs.

The other approach is to rewrite namenode to be a scalable key-value store with specialized file system function. To remove the limitation on the amount of metadata, we will use out-of-core representation. One problem is the efficiency and performance of the out-of-core metadata. We present the performance overhead for using the out-of-core metadata with single namenode.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Relaxed Memory Models and the Steel Programming Language

Friday, December 7^th, 2012 from 12-1 pm in GHC 4303.

Presented by Luke Zarko, CSD

Before the widespread adoption of shared memory multiprocessors, one could generally assume that the memory system was sequentially consistent. In such a machine, actions on the memory are ordered globally; furthermore, these actions occur in a way that is in accordance with the original program order. This is no longer the case. Instructions may be handled out-of-order; memory requests may be shuffled and buffered in multiple places; observers may disagree about the overall sequence of events. Most importantly, all of these internal details are visible to the programmer. These relaxations of the sequentially-consistent view of memory are necessary for allowing low latency access and conservative power requirements in modern multicore machines.

Authors of low-level libraries for synchronization and users of lock-free algorithms must be aware of these behaviors, which differ (sometimes significantly) between each architecture. It is therefore important that systems programming languages provide abstractions for target-independent reasoning about memory. In this spirit, the recently-updated C++ standard now permits ordering annotations on memory operations; however, the definition of these constraints still makes it difficult to reason about the ultimate behavior of a multithreaded program. In particular, because these constraints are specified on each operation in isolation, the relationships between annotated actions--what the programmer is really concerned with expressing, and what the optimizing backend must preserve--must still be inferred.

In our nascent imperative programming language Steel, programmers directly describe relationships between memory operations rather than properties of individual operations. It is then the duty of the compiler to synthesize a combination of barriers and dependencies that effects a runtime ordering that respects those relationships. In this talk, I will present a summary of the behavior of a modern relaxed memory system (specifically, the ARM Cortex-A9 MPCore) and will show how our new abstractions and compiler backend enable programmers to address these issues.

Joint work with Michael Arntzenius, Karl Crary, David Eckhardt, Salil Joshi, and Michael Sullivan.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Web contact: sss+www@cs