This repository was archived by the owner on Jan 6, 2021. It is now read-only.

Commit f1acaf6

tumor only module improvement
1 parent de3edb7 commit f1acaf6

3 files changed: 9 additions, 3 deletions
README.md

Lines changed: 3 additions & 1 deletion
@@ -1,9 +1,11 @@
 MSIsensor
 ===========
-MSIsensor is a C++ program to detect replication slippage variants at microsatellite regions, and differentiate them as somatic or germline. Given paired tumor and normal sequence data, it builds a distribution for expected (normal) and observed (tumor) lengths of repeated sequence per microsatellite, and compares them using Pearson's Chi-Squared Test. Comprehensive testing indicates MSIsensor is an efficient and effective tool for deriving MSI status from standard tumor-normal paired sequence data. Since there are many users complained that they don't have paired normal sequence data or related normal sequence data can be used to build a paired normal control, we released MSIsensor with version from 0.3. Given tumor only sequence data, it uses comentropy theory and figures out a comentropy value for a distribution per microsatellite. Our test results show that it's performance is comparable with paired tumor and normal sequence data input(figure below). We suggest msi score cutoff 11% for tumor only data. (msi high: msi score >= 11%).
+MSIsensor is a C++ program to detect replication slippage variants at microsatellite regions, and differentiate them as somatic or germline. Given paired tumor and normal sequence data, it builds a distribution for expected (normal) and observed (tumor) lengths of repeated sequence per microsatellite, and compares them using Pearson's Chi-Squared Test. Comprehensive testing indicates MSIsensor is an efficient and effective tool for deriving MSI status from standard tumor-normal paired sequence data. Since there are many users complained that they don't have paired normal sequence data or related normal sequence data can be used to build a paired normal control, we released MSIsensor with version from 0.3. Given tumor only sequence data, it uses comentropy theory and figures out a comentropy value for a distribution per microsatellite. Our test results show that it's performance is comparable with paired tumor and normal sequence data input(figure below). And, We recommend to set different msi score cutoff values for different cancer types. (for example: TCGA UCEC, msi high: msi score >= 13%). We also provide test result of TCGA and EGA data.(see AUC figures below)
 
 ![](https://github.com/ding-lab/msisensor/blob/master/test/tumor_only_vs_pair.jpg)
 
+![](https://github.com/ding-lab/msisensor/blob/master/test/msisensor-tumor-only.png)
+
 If you used this tool for your work, please cite [PMID 24371154](https://www.ncbi.nlm.nih.gov/pubmed/24371154)
 
 Beifang Niu*, Kai Ye*, Qunyuan Zhang, Charles Lu, Mingchao Xie, Michael D. McLellan, Michael C. Wendl and Li Ding#.MSIsensor: microsatellite instability detection using paired tu-mor-normal sequence data. Bioinformatics 30, 1015–1016 (2014).

homo.cpp

Lines changed: 5 additions & 1 deletion
@@ -316,7 +316,8 @@ void HomoSite::DisTumorSomatic(Sample &sample) {
         withSufCov = false;
         comentropy = 0;
     }
-    if (comentropy >= paramd.comentropyThreshold) {
+    //if (comentropy >= paramd.comentropyThreshold) {
+    if (withSufCov and comentropy >= paramd.comentropyThreshold) {
         reportSomatic = true;
         sample.numberOfMsiDataPoints ++;
     }
@@ -416,6 +417,7 @@ double HomoSite::DistanceBetweenTwo( unsigned short * FirstOriginal, unsigned sh
 double HomoSite::Comentropy( unsigned short * tumorDis, unsigned int dispots ) {
     double sum = 0;
     double comentropy = 0.0;
+    double number = 0.0;
     for (int i = 0; i < dispots; i++){
         if (tumorDis[i] <3){
             tumorDis[i] = 0;
@@ -425,10 +427,12 @@ double HomoSite::Comentropy( unsigned short * tumorDis, unsigned int dispots ) {
     if ( sum != 0 ) {
         for( int j = 0; j < dispots; j++){
             if( tumorDis[j] != 0 ){
+                number += 1;
                 comentropy -= (tumorDis[j]/sum)*log(tumorDis[j]/sum);
             }
         }
     }
+    if ( number != 0 ) comentropy = comentropy/number;
     return comentropy;
 }
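
Note on the `Comentropy` change: after this commit the Shannon entropy of the tumor length distribution is divided by the number of non-empty bins. The sketch below restates that calculation as a standalone function. It is an illustration, not the repository's code: the name `normalizedComentropy`, the `minReads` parameter, and the read-only `std::vector` input are assumptions, and because the diff does not show the lines between the thresholding and `if ( sum != 0 )`, the sketch assumes `sum` accumulates the counts that survive the `< 3` filter.

```cpp
#include <cmath>
#include <vector>

// Hypothetical standalone restatement of the post-commit calculation.
// Unlike HomoSite::Comentropy, it leaves its input untouched.
double normalizedComentropy(const std::vector<unsigned short>& tumorDis,
                            unsigned short minReads = 3) {
    // Assumption: bins with fewer than minReads reads are treated as empty,
    // and sum is the total of the remaining counts.
    double sum = 0.0;
    for (unsigned short count : tumorDis) {
        if (count >= minReads) sum += count;
    }
    if (sum == 0.0) return 0.0;

    double entropy = 0.0;
    double number = 0.0;  // non-empty bins, as in the added `number` variable
    for (unsigned short count : tumorDis) {
        if (count >= minReads) {
            const double p = count / sum;
            entropy -= p * std::log(p);  // natural log, as in the diff
            number += 1.0;
        }
    }
    // The step added by this commit: average the entropy over non-empty bins.
    return entropy / number;
}
```

With k equally filled bins the result is ln(k)/k, which peaks at about 0.366 for k = 3, so values of the per-bin measure stay well below the old threshold of 1.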

param.cpp

Lines changed: 1 addition & 1 deletion
@@ -103,7 +103,7 @@ Param::Param()
     , covCutoff( 20 )
     , Normalization(0)
     , fdrThreshold( 0.05 )
-    , comentropyThreshold( 1 ) // for tumor only
+    , comentropyThreshold( 0.3 ) // for tumor only
 {
     inital_homo_phabet();
     initalphabet();

0 commit comments
