This repository was archived by the owner on Jan 6, 2021. It is now read-only.

Commit 383a939

documentation improvement
1 parent 89ba63f commit 383a939

3 files changed, 96 additions and 6 deletions

README.md

Lines changed: 94 additions & 4 deletions
@@ -1,6 +1,6 @@
 MSIsensor
 ===========
-Homopolymer and microsatellite analysis using bam files
+MSIsensor is a C++ program for automatically detecting somatic microsatellite changes. It computes length distributions of microsatellites per site in paired tumor and normal sequence data, then uses these to statistically compare the observed distributions in both samples. Comprehensive testing indicates MSIsensor is an efficient and effective tool for deriving MSI status from standard tumor-normal paired sequence data.
 
 Usage
 -----
@@ -13,8 +13,41 @@ Key commands:
     scan    scan homopolymers and microsatellites
     msi     msi scoring
 
-This tool was originally designed to do homopolymer and microsatellites analysis.
-
+msisensor scan [options]:
+
+    -d   <string>   reference genome sequences file, *.fasta format
+    -o   <string>   output homopolymer and microsatellites file
+
+    -l   <int>      minimal homopolymer size, default=5
+    -c   <int>      context length, default=5
+    -m   <int>      maximal homopolymer size, default=50
+    -s   <int>      maximal length of microsate, default=5
+    -r   <int>      minimal repeat times of microsate, default=3
+    -p   <int>      output homopolymer only, 0: no; 1: yes, default=0
+
+    -h   help
+
+msisensor msi [options]:
+
+    -d   <string>   homopolymer and microsates file
+    -n   <string>   normal bam file
+    -t   <string>   tumor bam file
+    -o   <string>   output distribution file
+
+    -e   <string>   bed file
+    -r   <string>   choose one region, format: 1:10000000-20000000
+    -l   <int>      minimal homopolymer size, default=5
+    -p   <int>      minimal homopolymer size for distribution analysis, default=10
+    -m   <int>      maximal homopolymer size for distribution analysis, default=50
+    -q   <int>      minimal microsates size, default=3
+    -s   <int>      minimal microsates size for distribution analysis, default=5
+    -w   <int>      maximal microsates size for distribution analysis, default=40
+    -u   <int>      span size around window for extracting reads, default=500
+    -b   <int>      threads number for parallel computing, default=1
+    -x   <int>      output homopolymer only, 0: no; 1: yes, default=0
+    -y   <int>      output microsatellite only, 0: no; 1: yes, default=0
+
+    -h   help
 
 Install
 -------
@@ -42,5 +75,62 @@ I recommend dumping it in the system directory for locally compiled packages:
 
     sudo mv msisensor /usr/local/bin/
 
-xxx
+Example
+-------
+1. Scan microsatellites from the reference genome:
+
+    msisensor scan -d referen.fa -o microsatellites.list
+
+2. MSI scoring:
+
+    msisensor msi -d microsatellites.list -n normal.bam -t tumor.bam -e bed.file -o output.prefix -l 1 -q 1 -b 2
+
+
+Output
+-------
+The "scan" step produces one microsatellite list.
+The MSI scoring step produces 4 output files based on the given output prefix:
+    output.prefix
+    output.prefix_dis
+    output.prefix_germline
+    output.prefix_somatic
+
+1. microsatellites.list: microsatellite list output
+
+    chromosome  location  site_length  site_content  repeat_times  front_flank  tail_flank  site_bases  front_flank_bases  tail_flank_bases
+    1           10485     4            149           3             150          685         GCCC        AGCCG              GGGTC
+    1           10629     2            9             3             258          409         GC          CAAAG              CGCGC
+    1           10652     2            2             3             665          614         AG          GGCGC              GCGCG
+    1           10658     2            9             3             546          409         GC          GAGAG              CGCGC
+    1           10681     2            2             3             665          614         AG          GGCGC              GCGCG
+
+2. output.prefix: msi score output
+
+    Total_Number_of_Sites  Number_of_Somatic_Sites  %
+    640                    75                       11.72
+
+3. output.prefix_dis: read count distribution (N: normal; T: tumor)
+
+    1 10529896 CTTTC 15[T] GAGAC
+    N: 0 0 0 0 0 0 0 1 0 0 8 9 1 7 17 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
+    T: 0 0 0 0 0 0 0 0 0 1 19 14 17 9 32 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
+
+4. output.prefix_somatic: somatic sites detected
+
+    chromosome  location  front_flank  repeat_times  site_content  tail_flank  difference  P_value
+    1           10357206  TTGAA        17            T             ACTTC       0.45670     0.00014
+    1           11140610  TCTGG        11            A             CACAC       0.80855     0.00000
+    1           11156045  ACATC        15            T             GAGAC       0.75281     0.00001
+    1           12368705  GAGTG        15            T             GAGAT       0.51139     0.00000
+    1           16200729  TAAGA        10            T             CTTGT       0.55652     0.00000
+    1           16245610  AAGGG        10            T             GCATA       0.75928     0.00000
+
+5. output.prefix_germline: germline sites detected
+
+    chromosome  location  front_flank  repeat_times  site_content  tail_flank  xxx|xxxx
+    1           1192105   AATAC        11            A             TTAGC       5|5
+    1           1330899   CTGCC        5             AG            CACAG       5|5
+    1           1598690   AATAC        12            A             TTAGC       5|5
+    1           1605407   AAAAG        14            A             GAAAA       1|1
+    1           2118724   TTTTC        11            T             CTTTT       1|1
 
distribution.cpp

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ std::ofstream foutD;
 std::string one_region;
 
 void DisUsage(void) {
-    std::cerr<<"\nUsage: plolyscape dis [options] \n\n"
+    std::cerr<<"\nUsage: msisensor msi [options] \n\n"
         <<" -d <string> homopolymer and microsates file\n"
         <<" -n <string> normal bam file\n"
         <<" -t <string> tumor bam file\n"
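The README changes in this commit document the files written by `msisensor msi`. As a reading aid, here is a minimal C++ sketch, not taken from the MSIsensor sources, that parses one `N:` or `T:` line of `output.prefix_dis` and recomputes the `%` column of `output.prefix`. The helper names (`parseCounts`, `totalReads`, `modePosition`, `somaticPercent`) are hypothetical. The sketch assumes whitespace-separated fields and that `%` is the number of somatic sites over the total number of sites, which is consistent with the README's example row (75 of 640, reported as 11.72).

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Parse one "N: ..." or "T: ..." line from an output.prefix_dis record
// into its read counts. The leading label is skipped, not validated.
std::vector<int> parseCounts(const std::string& line) {
    std::vector<int> counts;
    std::istringstream in(line);
    std::string label;  // "N:" or "T:"
    in >> label;
    int value = 0;
    while (in >> value) {
        counts.push_back(value);
    }
    return counts;
}

// Total number of reads recorded in one distribution.
long totalReads(const std::vector<int>& counts) {
    long total = 0;
    for (int c : counts) {
        total += c;
    }
    return total;
}

// 1-based position of the largest count (the first one wins on ties).
// Returns 0 for an empty distribution.
std::size_t modePosition(const std::vector<int>& counts) {
    if (counts.empty()) {
        return 0;
    }
    std::size_t best = 0;
    for (std::size_t i = 1; i < counts.size(); ++i) {
        if (counts[i] > counts[best]) {
            best = i;
        }
    }
    return best + 1;
}

// Percentage of somatic sites (assumed meaning of the "%" column).
double somaticPercent(long totalSites, long somaticSites) {
    if (totalSites <= 0) {
        return 0.0;
    }
    return 100.0 * static_cast<double>(somaticSites) /
           static_cast<double>(totalSites);
}
```

For the README's example record at 1:10529896 (`15[T]`), the largest count in both the `N:` and `T:` lines sits at the 15th position, so `modePosition` would return 15 for each. This suggests, but does not prove, that position k holds the number of reads supporting k repeats.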

scan.cpp

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ std::ifstream fin_d;
 std::ofstream fout;
 
 void ScanUsage(void) {
-    std::cerr<<"\nUsage: plolyscape scan [options] \n\n"
+    std::cerr<<"\nUsage: msisensor scan [options] \n\n"
         <<" -d <string> reference genome sequences file, *.fasta format\n"
         <<" -o <string> output homopolymer and microsatelittes file\n\n"
         <<" -l <int> minimal homopolymer size, default="<<param.MininalHomoSize<<"\n"
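The list written by `msisensor scan` is described in the README part of this commit by a ten-column header. The following C++ sketch shows one way a downstream tool might read a data row of that list. It is not code from this repository: the `ScanSite` struct and `parseScanRow` function are hypothetical names, the field names are copied from the README header, and the field types and whitespace-separated layout are assumptions based on the README's example rows.

```cpp
#include <sstream>
#include <string>

// One data row of the scan output. Field names follow the README header;
// the types are assumptions inferred from the example rows.
struct ScanSite {
    std::string chromosome;
    long location = 0;
    int site_length = 0;
    int site_content = 0;
    int repeat_times = 0;
    int front_flank = 0;
    int tail_flank = 0;
    std::string site_bases;
    std::string front_flank_bases;
    std::string tail_flank_bases;
};

// Parse one whitespace-separated row into `site`.
// Returns false if a field is missing or a numeric field does not parse,
// which also rejects the header line.
bool parseScanRow(const std::string& line, ScanSite& site) {
    std::istringstream in(line);
    in >> site.chromosome >> site.location >> site.site_length
       >> site.site_content >> site.repeat_times >> site.front_flank
       >> site.tail_flank >> site.site_bases >> site.front_flank_bases
       >> site.tail_flank_bases;
    return !in.fail();
}
```

In the README's example rows, the length of `site_bases` equals `site_length` (for instance `GCCC` with 4), so a caller could use that as a cheap sanity check after parsing.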
