Java utilities for Next Generation Sequencing

Pierre Lindenbaum PhD


Download and install

See Download and Install.


| Tool | Description |
|------|-------------|
| SplitBam | Split a BAM by chromosome group. Creates EMPTY BAMs if no reads were found for a given group. |
| SamJS | Filtering a SAM/BAM with javascript (rhino). |
| VCFFilterJS | Filtering a VCF with javascript (rhino). |
| SortVCFOnRef | Sort a VCF using the order of the chromosomes in a REFerence index. |
| Illuminadir | Create a structured (**JSON** or **XML**) representation of a directory containing some Illumina FASTQs. |
| BamStats04 | Coverage statistics for a BED file. It uses the Cigar string instead of the start/end to compute the coverage. |
| BamStats01 | Statistics about the reads in a BAM. |
| VCFBed | Annotate a VCF with the content of a BED file indexed with tabix. |
| VCFPolyX | Number of repeated REF bases around POS. |
| VCFBigWig | Annotate a VCF with the data of a bigwig file. |
| VCFTabixml | Annotate a value from a vcf+xml file. The 4th column of the BED indexed with TABIX is an XML string. |
| GroupByGene | Group VCF data by gene/transcript. |
| VCFPredictions | Basic variant prediction using UCSC knownGenes. |
| FindCorruptedFiles | Reads filenames from stdin and prints corrupted NGS files (VCF/BAM/FASTQ). |
| VCF2XML | Transforms a VCF to XML. |
| VCFAnnoBam | Annotate a VCF with the coverage statistics of a BAM file + BED file of capture. It uses the Cigar string instead of the start/end to get the coverage. |
| VCFTrio | Check for mendelian incompatibilities in a VCF. |
| SamGrep | Search reads in a BAM. |
| VCFFixIndels | Fix samtools INDELS for @SolenaLS |
| NgsFilesSummary | Scan folders and generate a summary of the files (SAMPLE/BAM SAMPLE/VCF etc..). |
| NoZeroVariationVCF | Creates a VCF containing one fake variation if the input is empty. |
| HowManyBamDict | For @abinouze : quickly find the number of distinct BAM Dictionaries from a set of BAM files. |
| ExtendBed | Extends a BED file by 'X' bases. |
| CmpBams | Compare two or more BAMs. |
| IlluminaFastqStats | Statistics on Illumina Fastqs. |
| Bam2Raster | Save a BAM alignment as a PNG image. |
| VcfRebase | Finds restriction sites overlapping variants in a VCF file. |
| FastqRevComp | Reverse complement a FASTQ file for mate-pair alignment. |
| PicardMetricsToXML | Convert a Picard metrics file to XML. |
| Bam2Wig | Bam to Wiggle converter. |
| TViewWeb | CGI/Web based version of samtools tview. |
| VcfRegistryWeb | CGI/Web tool printing all the variants at a given position for a collection of VCFs. |
| BlastMapAnnots | Maps uniprot/genbank annotations on a blast result. See |
| VcfViewGui | Simple java-Swing-based VCF viewer. |
| BamViewGui | Simple java-Swing-based BAM viewer. |
| Biostar81455 | Defining precisely the genomic context based on a position. |
| MapUniProtFeatures | Map Uniprot features on reference genome. |
| Biostar86363 | Set genotype of specific sample/genotype comb to unknown in multisample vcf file. |
| FixVCF | Fix a VCF HEADER when I forgot to declare a FILTER or an INFO field in the HEADER. |
| Biostar78400 | Add the read group info to the sam file on a per lane basis. |
| Biostar78285 | Extract regions of genome that have 0 coverage. See |
| Biostar77288 | Low resolution sequence alignment visualization. |
| Biostar77828 | Divide the human genome among X cores, taking into account gaps. See |
| Biostar76892 | Fix strand of two paired reads close but on the same strand. |
| VCFCompareGT | VCF : compare genotypes of two or more callers for the same samples. |
| SAM4WebLogo | Creates an Input file for BAM + WebLogo. |
| SAM2Tsv | Tabular view of each base of the reads vs the reference. |
| Biostar84786 | Table transposition. |
| VCF2SQL | Generate the SQL code to insert a VCF into a database. |
| Bam4DeseqIntervals | Creates a table for DESEQ with the number of reads within a sliding window for multiple BAMs. |
| VCFStripAnnotations | Removes one or more fields from the INFO column of a VCF. |
| VCFGeneOntology | Finds the GO terms for VCF annotated with SNPEFF or VEP. |
| VCFFilterGO | Set the VCF FILTERs on VCF files annotated with SNPEFF or VEP, testing whether a Gene belongs or not to the descendants of a GO term. |
| Biostar86480 | Genomic restriction finder. See |
| BamToFastq | Shrink your FASTQ.bz2 files by 40+% using this one weird tip by ordering them by alignment to reference. |
| PadEmptyFastq | Pad empty fastq sequence/qual with N/# |
| SamFixCigar | Replace 'M' (match) in SAM cigar by 'X' or '=' |
| FixVcfFormat | Fix PL format in VCF. Problem is described in |
| VcfToRdf | Convert a VCF to RDF. |
| VcfShuffle | Shuffle a VCF. |
| DownSampleVcf | Down sample a VCF. |
| VcfHead | Print the first variants of a VCF. |
| VcfTail | Print the last variants of a VCF. |
| VcfCutSamples | Select/Exclude some samples from a VCF. |
| VcfStats | Generate some statistics from a VCF. |
| VcfSampleRename | Rename Samples in a VCF. |
| VcffilterSequenceOntology | Filter a VCF on Sequence Ontology (SO). |
| Biostar59647 | Position of mismatches per read from a sam/bam file (XML). See |
| VcfRenameChromosomes | Rename chromosomes in a VCF (eg. convert hg19/ucsc to grch37/ensembl). |
| BamRenameChromosomes | Rename chromosomes in a BAM (eg. convert hg19/ucsc to grch37/ensembl). |
| BedRenameChromosomes | Rename chromosomes in a BED (eg. convert hg19/ucsc to grch37/ensembl). |
| BlastnToSnp | Map variations from a BLASTN-XML file. |
| Blast2Sam | Convert a BLASTN-XML input to SAM. |
| VcfMapUniprot | Map uniprot features on VCF annotated with VEP or SNPEff. |
| VcfCompare | Compare two VCF files. |
| VcfBiomart | Annotate a VCF with the data from Biomart. |
| VcfLiftOver | LiftOver a VCF file. |
| BedLiftOver | LiftOver a BED file. |
| VcfConcat | Concatenate VCF files. |
| MergeSplittedBlast | Merge Blast hits from a split database. |
| FindMyVirus | Virus+host cell : split BAM into categories. |
| Biostar90204 | Linux split equivalent for BAM file. |
| VcfJaspar | Finds JASPAR profiles in VCF. |
| GenomicJaspar | Finds JASPAR profiles in Fasta. |
| VcfTreePack | Create a TreeMap from one or more VCF. |
| BamTreePack | Create a TreeMap from one or more Bam. |
| FastqRecordTreePack | Create a TreeMap from one or more Fastq files. |
| WorldMapGenome | Map bed file to Genome + geographic data. |
| AddLinearIndexToBed | Use a Sequence dictionary to create a linear index for a BED file. Can be used as an X-Axis for a chart. |
| VCFComm | Compare multiple VCF files, output a new VCF file. |
| VcfIn | Prints variants that are contained/not contained in another VCF. |
| Biostar92368 | Binary interactions depth. See also |
| FastqGrep | Finds reads in fastq files. |
| VcfCadd | Annotate a VCF with Combined Annotation Dependent Depletion (CADD) data. |
| SortVCFOnInfo | Sort a VCF using a field in the INFO column. |
| GCAndDepth | Extracts GC% and depth for multiple bam using a sliding window. |
| Biostar94573 | Getting a VCF file from a CLUSTALW or FASTA alignment. |
| CompareBamAndBuild | Compare two BAM files mapped on two different builds. Requires a liftover chain file. |
| KnownGenesToBed | Convert UCSC KnownGene to BED. |
| Biostar95652 | Drawing a schematic genomic context tree. See also |
| SamToPsl | Convert SAM/BAM to PSL or BED12. |
| BWAMemNOp | Merge the SA:Z:* attributes of a read mapped with bwa-mem and prints a read containing a cigar string with 'N' (Skipped region from the REF). |
| FastqEntropy | Compute the Entropy of a Fastq file (distribution of the length(gzipped(sequence))). |
| NgsFilesScanner | Build a persistent database of NGS files. Dump as XML. |
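
As an illustration of the kind of operation the FastqRevComp entry above describes, here is a minimal sketch of reverse-complementing one FASTQ record. It is not the tool's actual source: the class `FastqRevCompSketch` and its methods are hypothetical names chosen for this example, and it assumes a record is given as its four text lines (name, bases, separator, qualities). The bases are reversed and complemented, while the quality string is only reversed so that each quality stays attached to its base.

```java
public class FastqRevCompSketch {

    /** Complement of one nucleotide, preserving case; any other symbol becomes 'N' (assumption of this sketch). */
    public static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'a': return 't';
            case 't': return 'a';
            case 'c': return 'g';
            case 'g': return 'c';
            case 'n': return 'n';
            default:  return 'N';
        }
    }

    /** Walks the sequence from its last base to its first, appending the complement of each base. */
    public static String reverseComplement(String bases) {
        StringBuilder sb = new StringBuilder(bases.length());
        for (int i = bases.length() - 1; i >= 0; i--) {
            sb.append(complement(bases.charAt(i)));
        }
        return sb.toString();
    }

    /** Qualities are reversed but not otherwise changed. */
    public static String reverseQualities(String quals) {
        return new StringBuilder(quals).reverse().toString();
    }

    /** Takes the four lines of one FASTQ record and returns the reverse-complemented record. */
    public static String[] revCompRecord(String[] rec) {
        if (rec.length != 4 || rec[1].length() != rec[3].length()) {
            throw new IllegalArgumentException("expected 4 lines with bases and qualities of equal length");
        }
        return new String[] { rec[0], reverseComplement(rec[1]), rec[2], reverseQualities(rec[3]) };
    }

    public static void main(String[] args) {
        // Hypothetical record used only for demonstration.
        String[] in = { "@read1", "ACGTN", "+", "IIH#!" };
        for (String line : revCompRecord(in)) {
            System.out.println(line);
        }
    }
}
```

Running `main` on the hypothetical record prints `@read1`, `NACGT`, `+`, `!#HII`, one per line.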


