A recent focus in computer-aided diagnosis of neurological diseases is predicting clinical scores from brain images. Most existing methods estimate each clinical variable separately and so ignore the useful correlations among them. In addition, nearly all methods use a single data modality (mostly structural MRI) for regression and thus ignore the complementary information that different modalities provide. To address these issues, we present a general methodology, Multi-Modal Multi-Task (M3T) learning, that jointly predicts multiple variables from multi-modal data. Our method consists of three sequential steps: (1) multi-task feature selection, which selects from each modality a common subset of features relevant to the multiple related clinical variables; (2) kernel-based multi-modal data fusion, which fuses the selected features from all modalities; and (3) support vector regression, which predicts the multiple clinical variables from the previously learnt mixed kernel. Experimental results on the ADNI dataset, with both imaging modalities (MRI and PET) and a biological modality (CSF), validate the efficacy of the proposed M3T learning method.
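The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: scikit-learn's `MultiTaskLasso` stands in for the multi-task feature selection, each modality gets a linear kernel, the mixed kernel is a weighted sum with fixed hand-picked weights, and one `SVR` is trained per clinical variable on that kernel. The function names, the regularization value, the kernel weights, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso
from sklearn.svm import SVR


def select_shared_features(X, Y, alpha=0.1):
    """Step 1 (assumed stand-in): keep features with a nonzero weight for the targets."""
    model = MultiTaskLasso(alpha=alpha, max_iter=5000).fit(X, Y)
    # coef_ has shape (n_targets, n_features); the mixed-norm penalty
    # drives whole columns to zero, so a feature is kept or dropped jointly.
    return np.flatnonzero(np.linalg.norm(model.coef_, axis=0) > 1e-8)


def mixed_kernel(blocks_a, blocks_b, weights):
    """Step 2 (assumed form): weighted sum of per-modality linear kernels."""
    return sum(w * (A @ B.T) for w, A, B in zip(weights, blocks_a, blocks_b))


def fit_predict_m3t(train_blocks, Y_train, test_blocks, weights, alpha=0.1):
    # Step 1: select a shared feature subset separately within each modality.
    selected = [select_shared_features(X, Y_train, alpha) for X in train_blocks]
    tr = [X[:, idx] for X, idx in zip(train_blocks, selected)]
    te = [X[:, idx] for X, idx in zip(test_blocks, selected)]
    # Step 2: fuse the modalities into one train/train and one test/train kernel.
    K_train = mixed_kernel(tr, tr, weights)
    K_test = mixed_kernel(te, tr, weights)
    # Step 3: one support vector regressor per clinical variable, same kernel.
    preds = [
        SVR(kernel="precomputed", C=1.0).fit(K_train, Y_train[:, t]).predict(K_test)
        for t in range(Y_train.shape[1])
    ]
    return np.column_stack(preds), selected


# Synthetic stand-in data: three modalities and two target scores (not ADNI data).
rng = np.random.default_rng(0)
n_train, n_test = 60, 20
dims = [20, 20, 3]  # hypothetical feature counts per modality
blocks = [rng.standard_normal((n_train + n_test, d)) for d in dims]
signal = sum(X[:, 0] for X in blocks)  # both targets depend on the same features
noise = 0.1 * rng.standard_normal((n_train + n_test, 2))
Y = np.column_stack([signal, -0.5 * signal]) + noise

train_blocks = [X[:n_train] for X in blocks]
test_blocks = [X[n_train:] for X in blocks]
preds, selected = fit_predict_m3t(
    train_blocks, Y[:n_train], test_blocks, weights=[0.4, 0.4, 0.2]
)
print(preds.shape, [len(idx) for idx in selected])
```

In practice the kernel weights and the regularization strength would be tuned, for example by cross-validation on the training set, rather than fixed as they are here.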