Data file and data structure documentation for the data used in the Science paper.
Tom Mitchell
Sept, 2009

fMRI data is available for a number of human subjects. For each subject, the full data set is stored in a file whose name contains the subject identifier (e.g., the data for subject P1 is in data-science-P1.mat). After you load a file, you will find three variables defined: info, data, and meta.

The variable 'meta' contains general information about the dataset.
The variable 'info' describes each presentation trial.
The variable 'data' contains the actual image intensity values.

Detailed documentation for each variable is provided below:

===========================================
META

meta: This variable provides information about the data set. Relevant fields are shown in the following example:

meta =
       study: 'science'
     subject: 'P1'
     ntrials: 360
     nvoxels: 21764
        dimx: 51
        dimy: 61
        dimz: 23
  colToCoord: [21764x3 double]
  coordToCol: [51x61x23 double]

meta.study gives the name of the fMRI study.

meta.subject gives the identifier for the human subject.

meta.ntrials gives the number of trials in this dataset.

meta.nvoxels gives the number of voxels (3D pixels) in each image.

meta.dimx gives the maximum x coordinate in the brain image. The minimum x coordinate is x=1. meta.dimy and meta.dimz give the same information for the y and z coordinates.

meta.colToCoord(v,:) gives the geometric coordinate (x,y,z) of the voxel corresponding to column v in the data.

meta.coordToCol(x,y,z) gives the column index (within the data) of the voxel whose coordinate is (x,y,z).

===========================================
INFO

info: This variable defines the experiment in terms of a sequence of 'trials'. 'info' is a 1x360 struct array, describing the 360 trials.
The relevant fields of info are illustrated in the following example:

info(50) =
         cond: 'tool'
  cond_number: 11
         word: 'hammer'
  word_number: 2
        epoch: 1

info.cond gives the condition (in most cases, the category of the word) presented during this trial.

info.cond_number gives the numeric index of the condition presented during this trial. info.cond_number ranges from 2 to 13 because there are twelve different categories. Indexing starts at 2 because cond_number=1 was used to indicate a rest, or fixation, interval. In the example above, cond_number is 11 because 'tool' is the tenth category.

info.word gives the word presented during this trial. In the example above, the word 'hammer' is presented.

info.word_number gives the numeric index of the word presented during this trial. info.word_number ranges from 1 to 5 because there are five words per condition. In the example above, word_number=2 because 'hammer' is the second word in the 'tool' category.

info.epoch gives the number of times this word has been presented, counting the current trial. In the example above, epoch=1 denotes that this is the first time the word 'hammer' is presented.

===========================================
DATA

data: This variable contains the raw observed data. The fMRI data is a sequence of images collected over time. The data structure 'data' is a [360x1] cell array, with one cell per 'trial' in the experiment. Each element in this cell array is a 1xV array of observed fMRI activations, where V is the number of voxels (meta.nvoxels).

The element data{x}(1,v) gives the fMRI observation at voxel v within trial x.

The full image for trial x is given by data{x}(1,:).
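
===========================================
EXAMPLE

The following MATLAB sketch illustrates how the three variables described above fit together. It is not part of the original data release. It assumes that data-science-P1.mat is in the current folder or on the MATLAB path, and that the fields behave as documented here. The names trial, v, X, isHammer, meanHammer, and vol are illustrative only.

```matlab
% Sketch only: assumes data-science-P1.mat is in the current folder or on the path.
load('data-science-P1.mat');          % defines info, data, and meta

% Describe one trial using the fields of info.
trial = 50;
fprintf('Trial %d: word "%s", condition "%s" (cond_number %d), epoch %d\n', ...
        trial, info(trial).word, info(trial).cond, ...
        info(trial).cond_number, info(trial).epoch);

% Activation at one voxel, addressed by its column index v.
v = 1;
activation = data{trial}(1, v);

% Map the column index to an (x,y,z) coordinate and back again.
xyz   = meta.colToCoord(v, :);
vBack = meta.coordToCol(xyz(1), xyz(2), xyz(3));
% vBack should equal v if the two maps are inverses, as the documentation implies.

% Stack all trials into one ntrials-by-nvoxels matrix (one row per trial).
X = cell2mat(data);

% Average the images of every trial in which the word 'hammer' was shown.
isHammer   = strcmp({info.word}, 'hammer');   % 1 x ntrials logical mask
meanHammer = mean(X(isHammer, :), 1);         % 1 x nvoxels

% Rebuild a 3D volume for one trial. Positions with no voxel stay NaN.
vol = nan(meta.dimx, meta.dimy, meta.dimz);
lin = sub2ind(size(vol), meta.colToCoord(:, 1), ...
              meta.colToCoord(:, 2), meta.colToCoord(:, 3));
vol(lin) = data{trial};
```

vertcat(data{:}) is an equivalent way to stack the trials. The NaN fill in vol is a choice made for this sketch so that positions without a voxel are easy to tell apart from measured values.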