  • Effect of mutations on mRNA

    2020-08-28

                       | Effect of mutations on mRNA network       | Mutation data and transcriptional network
    iPAC [17]          | Copy number alterations                   | GE and mutation data
    MDPFinder [16]     | Mutual exclusivity of gene modules        | GE and mutation data
    MeMo [10]          | Genes correlation                         | Mutation data and network
    MSEA [11]          | Combination of data associated to disease | Pathways and networks
    MutsigCV [15]      | Frequency of mutations and spectrum       | GE and exome sequence
    NetBox [14]        | Functional modules in cellular networks   | Mutation data and network
    OncodriveCLUST [8] | Somatic mutation clustering               | Mutation data
    OncodriveFM [7]    | Functional mutations impact on gene       | Somatic variants data
    Simon [6]          | Mutations impact protein function         | Mutation data
    ACCEPTED MANUSCRIPT
    In the next step, we used iMaxDriver to predict CDGs. For iMaxDriver and the other fifteen methods, we then assessed the accuracy of the predicted CDGs by comparing each list against the Cancer Gene Census (CGC) [29] gene list as the gold standard (available from https://cancer.sanger.ac.uk/census). The IM approach was applied independently to each of the three GE datasets (see Subsection 3.1) to compute the influence of each TF in the network. The results of iMaxDriver are provided for each cancer type as a list of potential CDGs sorted by their influence (coverage count) in descending order. Subsequently, by discretizing the results with a threshold value, we classified the genes as either CDG or non-CDG. To fine-tune the threshold value used for this binary classification, we used the pROC [30] package in R.
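The thresholding step can be illustrated with a minimal Python sketch. The text only states that pROC in R was used to tune the threshold; it does not name the selection criterion. The sketch below assumes Youden's J statistic (sensitivity + specificity - 1), and the function names `youden_threshold` and `classify` are hypothetical, not part of iMaxDriver or pROC.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximising sensitivity + specificity - 1.

    scores: influence scores, one per gene (assumed: higher = more driver-like)
    labels: 1 if the gene is in the gold standard, 0 otherwise
    Assumption: Youden's J is the criterion; the source does not specify one.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    best_threshold, best_j = None, float("-inf")
    for threshold in sorted(set(scores)):
        # A gene is called a CDG when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        j = tp / positives - fp / negatives  # sensitivity - (1 - specificity)
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


def classify(scores_by_gene, threshold):
    """Return the set of genes labelled CDG at the given threshold."""
    return {gene for gene, score in scores_by_gene.items() if score >= threshold}
```

With toy scores, `youden_threshold([9, 8, 7, 3, 2, 1], [1, 1, 1, 0, 0, 0])` picks the cut that separates the two groups, and `classify` then yields the binary CDG / non-CDG labelling.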
    The F-measure is widely used to evaluate classifiers because it accounts for both precision and recall. It is the harmonic mean of precision and recall, defined as:
    \[ F\text{-measure} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \]
    where precision is defined as:
    \[ \text{precision} = \frac{TP}{TP + FP} \]
    and recall is defined as:
    \[ \text{recall} = \frac{TP}{TP + FN} \]
    In the above equations, TP is the number of true positives, FP the number of false positives, and FN the number of false negatives. We use the F-measure as the classification quality measure for evaluating iMaxDriver.
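As an illustration of these definitions, the following Python sketch computes precision, recall and the F-measure for a predicted gene list against a gold-standard list. The function name `f_measure` and the gene identifiers in the usage note are placeholders chosen for this example; treating every gold-standard gene that was not predicted as a false negative is an assumption about how the comparison is set up.

```python
def f_measure(predicted, gold):
    """Return (precision, recall, F-measure) for two collections of gene names.

    TP: predicted genes present in the gold standard
    FP: predicted genes absent from the gold standard
    FN: gold-standard genes that were not predicted (assumption)
    """
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    fp = len(predicted - gold)
    fn = len(gold - predicted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)
```

For example, with predictions `{"G1", "G2", "G3", "G4"}` and gold standard `{"G1", "G2", "G5", "G6"}`, there are two true positives, two false positives and two false negatives, so precision, recall and F-measure are all 0.5.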
    3. Results
    We weighted the modified TRN independently using the GE data of three cancer types: breast invasive carcinoma (BRCA), lung squamous cell carcinoma (LUSC) and colon adenocarcinoma (COAD). The list of predicted CDGs was then generated using iMaxDriver (Supplementary Datasets S1 and S2). iMaxDriverW found 103, 143 and 113 CDGs for BRCA, LUSC and COAD, respectively. iMaxDriverN found 88 driver genes in each of the BRCA and LUSC tissues and 90 driver genes in COAD. We evaluated our method and the other fifteen methods using the Cancer Gene Census (CGC) and the functionally validated driver genes provided by Kumar et al. [31], and gathered the results for BRCA, LUSC and COAD in Table 3. For each tissue type and validation dataset, the three methods with the best prediction results are shown in bold. In all of the tissue types, iMaxDriverW is one of the top three methods. Moreover, iMaxDriverN is one of the top three methods in the BRCA and LUSC tissue types.
    Table 3 The evaluation of the iMaxDriver and the other methods using CGC and Kumar datasets
    Tissue types: BRCA | LUSC | COAD
    Columns for each tissue type: Number of predicted drivers (Kumar, CGC) | Fraction of predicted drivers (Kumar, CGC)
    Rows: Method name
    Most CDG-finding tools predict only a limited number of genes. Although some of these tools report many CDGs, their precision is generally not acceptable. For example, for BRCA tissue, iPAC and MSEA predict 4821 and 855 genes as CDGs, with precision of only 5.1% and 8.8%, respectively. In contrast, iMaxDriverW predicts 408 genes as CDGs, with a precision of 33.3%. The binary matrix representation of the genes predicted as CDGs by the methods, together with the dendrogram of the clustering result, is shown for BRCA, LUSC and COAD in Fig. 4.
    Figure 4 Binary matrix representation of the genes predicted as CDG by the methods.
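A binary matrix of this kind can be sketched in Python as follows. This is only an illustration: the text does not state which distance or linkage was used for the dendrogram, so the Jaccard distance here is an assumption, and `binary_matrix`, `jaccard_distance` and the method and gene names in the usage note are hypothetical.

```python
def binary_matrix(predictions):
    """Build a methods x genes 0/1 matrix from {method: set of predicted genes}."""
    genes = sorted(set().union(*predictions.values()))
    methods = sorted(predictions)
    matrix = [[1 if gene in predictions[method] else 0 for gene in genes]
              for method in methods]
    return methods, genes, matrix


def jaccard_distance(row_a, row_b):
    """Jaccard distance between two 0/1 rows (assumed metric, not from the source)."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return 1.0 - both / either if either else 0.0
```

Given `{"methodA": {"G1", "G2"}, "methodB": {"G2", "G3"}}`, the rows are `[1, 1, 0]` and `[0, 1, 1]`. The pairwise distances between rows could then be passed to any hierarchical clustering routine to draw a dendrogram.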
    The bar plot comparing F-measure values of the methods is shown for BRCA, LUSC and COAD in Fig. 5.
    Figure 5 The F-measure of iMaxDriver and other fifteen computational methods proposed for CDG prediction
    Furthermore, by comparing the list of classified genes, we can see that more than 38% of the genes classified as