Date: 2005-01-20
Author: Robin Sabhnani
Purpose: examples to run usda project

-----------
data files
-----------

data.csv   : all case records so far from USDA
actual.csv : known pairs of similar or dissimilar cases

---------------
sample scripts
---------------

all.sh    : uses all features with predefined weights and feature types from USDA data
subset.sh : uses a subset of features from USDA data

----------------
example commands
----------------

# use the method variable to choose between the normalized
# and un-normalized weight versions
# method 1 = un-normalized
# method 0 = normalized
./all.sh sim method 1
./subset.sh sim method 0

# by default, for every query case, a similarity measure with each other
# case in the past is generated. use the threshold ( don't print anything
# less than threshold ) and top ( show only top n cases ) variables to
# prune the total number of pairs generated.
./all.sh sim method 0 threshold 0.6 top 10

# use the deg_1 and deg_2 variables to change cost slope for false alarms at ROC
# values in degrees ( note deg_1 > deg_2 ) defaults: 60, 30
./subset.sh sim method 1 threshold 0.5 deg_1 80 deg_2 10
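
The effect of the threshold and top variables can be illustrated with a small stand-alone sketch. This is not code from all.sh or subset.sh: the function name prune_pairs and the "case_id similarity" input format are assumptions made only for illustration. It drops every pair scoring below the threshold, then keeps the n highest-scoring pairs that remain.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the project scripts.
# Reads "case_id similarity" lines on stdin (assumed format).
#   $1 = threshold : drop pairs whose similarity is less than this
#   $2 = top       : keep only the n highest-scoring remaining pairs
prune_pairs() {
    awk -v t="$1" '$2 + 0 >= t + 0' |   # keep pairs at or above threshold
        LC_ALL=C sort -k2,2nr |         # highest similarity first
        head -n "$2"                    # show only the top n
}

# Made-up sample scores, pruned with threshold 0.6 and top 2.
printf '%s\n' 'c1 0.91' 'c2 0.40' 'c3 0.75' 'c4 0.62' | prune_pairs 0.6 2
```

With the made-up scores above, c2 falls below the threshold and c4 is cut by top 2, so only c1 and c3 are printed.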