Daria Sorokina

Publications



Refereed journal, conference and workshop papers
2009 Daria Sorokina, Rich Caruana, Mirek Riedewald, Wes Hochachka, Steve Kelling.
Detecting and Interpreting Variable Interactions in Observational Ornithology Data. To appear in the International Workshop on Domain Driven Data Mining (DDDM'09).
2009 Daria Sorokina.
Application of Additive Groves Ensemble with Multiple Counts Feature Evaluation to KDD Cup'09 Small Data Set. In proceedings of the KDD Cup 2009 workshop.
2009 Sameer Singh, Jeremy Kubica, Scott Larsen, Daria Sorokina.
Parallel Large Scale Feature Selection for Logistic Regression. In proceedings of the SIAM International Conference on Data Mining (SDM'09).
2008 Daria Sorokina, Rich Caruana, Mirek Riedewald, Daniel Fink.
Detecting Statistical Interactions with Additive Groves of Trees. In proceedings of the 25th International Conference on Machine Learning (ICML'08).
2007 Daria Sorokina, Rich Caruana, Mirek Riedewald.
Additive Groves of Regression Trees. In proceedings of the 18th European Conference on Machine Learning (ECML'07). (Best Student Paper award.)
2007 W. Hochachka, R. Caruana, A. Munson, M. Riedewald, D. Sorokina, D. Fink, S. Kelling.
Data-Mining Discovery of Pattern and Process in Ecological Systems. Journal of Wildlife Management: 71(7), pp. 2427-2437.
2006 Daria Sorokina, Johannes Gehrke, Simeon Warner, Paul Ginsparg.
Plagiarism Detection in arXiv. In proceedings of the 6th IEEE International Conference on Data Mining (ICDM'06).
2006 R. Caruana, M. Elhawary, A. Munson, M. Riedewald, D. Sorokina, D. Fink, W. Hochachka, S. Kelling.
Mining Citizen Science Data to Predict Prevalence of Wild Bird Species. In proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06).


Tech reports
2006 Daria Sorokina, Johannes Gehrke, Simeon Warner, Paul Ginsparg.
Plagiarism Detection in arXiv. Technical Report TR2006-2046, Computing and Information Science, Cornell University, 2006.
2003 Daria Sorokina, Mikhail Petrovskiy.
Adaptation of the Fuzzy Decision Tree Algorithm for Multidimensional Datacubes. Collected Articles on Software Systems and Tools, CMC MSU publishing, Moscow, Russia.
2003 Daria Erofeyeva.
Fuzzy Approach to Classification for Multidimensional Datacubes. Diploma thesis, Moscow State University.