Reviewer 1 of SIMPAR 2016 submission 58
Comments to the author
======================
This paper presents a supervised learning framework for
quickly estimating the robustness of a parallel-jaw grasp.
When considering uncertainty in pose and friction, using
Monte Carlo sampling methods for estimating the success
probability is time consuming. To achieve faster robustness
estimates, the authors use a vast library of object models
to train two supervised learning algorithms (Random Forests
and Deep Learning), such that individual grasp robustness
estimates can be computed 3-4 orders of magnitude faster.
Generally, the approach is sound and the results are
encouraging, but the clarity of discussions, data, and
motivation could be improved. Specifically, please address
the following:
• The motivation for why fast robustness computations are
needed is not clear, and virtually non-existent in the
introduction. Only in the last sentence of the conclusion
do the authors suggest how such fast computations may
enable better grasp planning. Please make it clear in the
introduction how drastic speedups in robustness computation
can (a) improve existing applications and/or (b) enable new
methods or applications for grasping.
• In Sect. III, comment on why a parallel-jaw grasp is
defined by only two contact points (c1 and c2). Aren't
there cases in which each jaw may contact the object at
more than one point? Are these rare corner cases or common?
Are there cases in which an unstable two-point grasp
reorients into a stable 3 or 4 point grasp?
• Objects are assumed rigid, but no object is truly rigid.
Comment on the level of object compliance that can be
modeled with this rigid assumption.
• The grasp featurization is not clear. Specifically, how
are the inscribed disks (D1 and D2) computed? A schematic
showing an example object, contact point (c1), normal (n1),
depth map (M1), and inscribed disk (D1) would be very
helpful.
• For Fig. 4, the strictly decreasing histogram of
robustness for grasps outside the friction cone makes
sense, but the low robustness peak (at zero) for the grasps
within the friction cone seems odd. Please briefly comment
on why, intuitively, we see this peak. In other words, why
is violating the friction cone condition a good indicator
of low robustness, while adhering to it is NOT a good
predictor of success?
• In the uncertainty model, please comment on why the
object shape is assumed to be known exactly. In practice,
wouldn't this also be visually estimated with some
uncertainty?
• In Sect. IV-A, "p" does not appear to be defined. Also,
please break down the 289 dimensions of F for completeness.
• In Sect. IV-A, in order to claim that MC sampling can be
used as "ground truth," it is imperative that the
uncertainty (of a mean of 100 samples) be discussed. If
this cannot be computed analytically, just compute the
variance from many 100-sample sets for some representative
grasp.
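As a concrete illustration of the requested check (a minimal Python sketch; the true robustness value p_true = 0.3 and the number of repeat sets are assumptions for illustration, not values from the paper): each MC robustness estimate is a mean of 100 Bernoulli trials, so its standard error is sqrt(p(1-p)/100) <= 0.05 analytically, and the same figure can be recovered empirically from many 100-sample sets:

```python
import random

random.seed(0)

# Hypothetical true robustness (success probability) of one representative
# grasp; 0.3 is an assumption for illustration, not a value from the paper.
p_true = 0.3
n = 100  # samples per Monte Carlo estimate, as in the paper

# Analytic standard error of a mean of n Bernoulli(p) samples: sqrt(p(1-p)/n).
analytic_se = (p_true * (1 - p_true) / n) ** 0.5

# Empirical check: standard deviation of many independent n-sample means.
num_sets = 10_000
means = [sum(random.random() < p_true for _ in range(n)) / n
         for _ in range(num_sets)]
grand_mean = sum(means) / num_sets
empirical_se = (sum((m - grand_mean) ** 2 for m in means)
                / (num_sets - 1)) ** 0.5
```

For p_true = 0.3 the analytic standard error is about 0.046, and the empirical value from the repeated 100-sample sets should agree closely, which is the kind of uncertainty figure the reviewers ask the authors to report.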
• In Sect. IV-D, choosing the BIDMach regression
granularity as 0.01 "to match the input granularity" seems
totally arbitrary. Is there a better justification for this
quantization? If it truly is arbitrary, just say so.
• Figure 9 (bottom) is not really an intuitive way to
present the absolute residuals. At the very least, overlay
a "mean absolute residual" curve as well as a "mean
absolute residual" for uniformly random robustness
estimates. But there may be a better plot to show this
information in a more intuitive way.
• "RF" is not defined at its first use in Sect. II-D.
Reviewer 2 of SIMPAR 2016 submission 58
Comments to the author
======================
This paper presents an approach for estimating grasp
robustness using local surface patch information and
supervised learning with random forests and deep neural
networks. The authors use a large dataset of 1.66 million
local patches and compare the results of both methods, as
well as fixed-constant and angle-based baselines. A grasp
perturbation model is defined to test robustness of the
grasps against changes in contact poses. The grasp
stability prediction goal is to estimate the expected
force-closure given the local contact features. The authors
evaluate mean absolute errors and AUC measures and show
that both the random forest and DNN approaches can learn a
reliable prediction, with DNNs achieving a smaller error
and random forests being more computationally efficient.
Pros:
The paper presents an extensive evaluation of the described
methods in simulation. The results show that the presented
methods outperform the baselines and provide fast estimates
of the grasp stability. Furthermore, the results clearly
indicate that approaches that are able to scale with large
amounts of data, such as random forests and deep learning,
give better results than methods such as SVMs, linear or
logistic regression.
Criticism:
a) Can the authors elaborate on how their approach would
work on a real robot? What are the limitations of their
method when applied to a real robot system?
b) It would be interesting to see how the approach presented
here compares to Kappler et al. [27] when applied to
the same dataset.
c) In the related work section, the authors claim to use a
cloud robotics approach. However, in the experimental
section the authors mention that only a single machine with
two GPUs is used. Please clarify the connection to cloud
robotics.
d) There are a number of works that consider grasp stability
estimation using local information from tactile sensing
(e.g. works by H. Dang and Y. Bekiroglu). Please add
references to them.