USeq Changelog

New in USeq 8.9.6 (Sep 26, 2015)

  • Sam2USeq:
  • Modified the per region spreadsheet so that the FracBPs>= xxx column is defined by the user's minimum coverage setting.
  • Now gzipping pass and fail bed file outputs
  • MatchMates:
  • New app for joining second of pair alignments to their first of pair attribute fields. Use with FastqBarcodeTagger and Consensus.
  • Consensus:
  • New app that clusters alignments by position and molecular barcode, then calls consensus on the clustered alignments, outputting fastq for realignment.
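
The clustering and consensus step described for the Consensus app can be pictured with a minimal Python sketch. This is not the USeq implementation: the tuple layout, the function name `call_consensus`, and the strict-majority rule with `N` for ties are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def call_consensus(reads):
    """Group reads by (chrom, position, barcode) and majority-vote each base.

    `reads` is an iterable of (chrom, position, barcode, sequence) tuples.
    This layout is hypothetical and chosen only for the example.
    """
    clusters = defaultdict(list)
    for chrom, pos, barcode, seq in reads:
        clusters[(chrom, pos, barcode)].append(seq)

    consensus = {}
    for key, seqs in clusters.items():
        # Only vote over columns that every read in the cluster covers.
        length = min(len(s) for s in seqs)
        bases = []
        for i in range(length):
            base, count = Counter(s[i] for s in seqs).most_common(1)[0]
            # Assumed rule: call N unless one base has a strict majority.
            bases.append(base if count * 2 > len(seqs) else "N")
        consensus[key] = "".join(bases)
    return consensus


reads = [
    ("chr1", 100, "ACGT", "TTGCA"),
    ("chr1", 100, "ACGT", "TTGCA"),
    ("chr1", 100, "ACGT", "TTGGA"),  # one read disagrees at the 4th base
    ("chr1", 100, "GGGG", "TTGCA"),  # same position, different barcode
]
result = call_consensus(reads)
```

Reads sharing a position but carrying different barcodes end up in separate clusters, which is the point of tagging with a molecular barcode before collapsing.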

New in USeq 8.9.5 (Sep 26, 2015)

  • Sam2USeq:
  • Added a failed bed region output for those target bases with less than the indicated depth. Good for identifying holes in particular locations, and deletions
  • Sam2USeq, CalculatePerCycleErrorRate, SamAlignmentExtractor, MergePairedAlignments:
  • Added option to export key QC metrics in json format
  • Moved base summary stats to MergePairedAlignments to get a more accurate uniOb count for total, Q20, and Q30
  • MergeRegions:
  • Fixed a contract sort violation when running this on Java 1.7+
  • MpileupParser:
  • New app for parsing a SAMTools mpileup output file for non reference bases. Generates PointData for the reference, non reference, and fraction non reference for bases that pass the minimum read coverage filter
  • Histogram:
  • Changed counters to use long instead of int to avoid overflow; this affects ~10 apps
  • MultiSampleVCFFilter:
  • Fixed an issue with filtering records that contain no genotype quality GQ field. These were being failed and removed
  • FastqBarcodeTagger:
  • New app for adding barcode reads from a third fastq file to paired fastq file headers. Supports interlaced output for direct piping into downstream apps (recommended)
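
The failed bed region output added to Sam2USeq in this release reports target bases below the indicated depth. A rough Python sketch of that idea follows; the function name, the per-base depth list, and the 0-based half-open coordinates are assumptions for illustration and do not reflect Sam2USeq's actual code or options.

```python
def low_coverage_regions(chrom, start, depths, min_depth):
    """Return bed-style (chrom, start, end) runs where depth < min_depth.

    depths[i] is the read depth at 0-based position start + i.
    End coordinates are exclusive, as in bed format.
    """
    regions = []
    run_start = None
    for offset, depth in enumerate(depths):
        if depth < min_depth:
            if run_start is None:
                run_start = start + offset  # open a new failing run
        elif run_start is not None:
            regions.append((chrom, run_start, start + offset))
            run_start = None
    if run_start is not None:
        # Close a run that extends to the end of the target.
        regions.append((chrom, run_start, start + len(depths)))
    return regions


failed = low_coverage_regions("chr1", 1000, [30, 25, 4, 0, 3, 22, 9, 40], 10)
```

Adjacent failing bases are merged into one interval, so a deletion or coverage hole shows up as a single bed line rather than one line per base.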

New in USeq 8.9.4 (Sep 26, 2015)

  • SamAlignmentExtractor:
  • Rewrote the app to enable processing deep coverage datasets without exceeding memory requirements. Added lots of QC metrics.
  • BamBlaster:
  • New app for injecting snv and indel variants from a vcf file into a bam file
  • BamMixer:
  • New app for mixing BamBlaster alignments into different frequency tumor samples
  • VCFMutationMaker:
  • New app to generate vcf files with random snvs and indels over target regions for BamBlaster

New in USeq 8.9.3 (Jun 25, 2015)

  • Bed2UCSCRefFlat
  • New app for converting a bed file to a multi exonic UCSC RefFlat file
  • Sam2USeq
  • Fixed an issue where chromData from the MergePairedAlignments app failed to calculate read coverage with b37 data lacking the "chr" prefix. The bed file must now match the alignments.

New in USeq 8.9.2 (Apr 16, 2015)

  • PoReCNV:
  • New app for detecting CNV variants in large sets of exon or gene capture panels.

New in USeq 8.9.1 (Apr 16, 2015)

  • CollectBamStats:
  • Fixed an issue with setting thresholds for flagging datasets.
  • MergePairedAlignments:
  • Fixed an issue when merging bam files with recalibrated base scores. GATK is boosting their base score values past spec.
  • CalculatePerCycleErrorRate:
  • Dropped support for unsorted sam.gz files; this was causing issues with files that had no phiX.

New in USeq 8.9.0 (Apr 16, 2015)

  • MethylationArrayScanner:
  • Added option to generate coefficient of variation bar files for both visualization and analysis. Use them in a second run to find windows with significant extra variation. Note, use the PermFDR, not the Wilcoxon, since the latter needs many data points.
  • VCFMerger:
  • New app for merging VCF files with same sample names into one. Hashes header to collapse. Those with the same ID are also dropped. Will not work with downstream apps that cannot process mixed INFO and FORMAT records.
  • MuTechVCFParser:
  • Reversed the parsed output Tumor Normal columns for compatibility with SomaticSniper and Strelka

New in USeq 8.8.9 (Feb 25, 2015)

  • FilterIntersectingRegions:
  • Added a max gap option and support for splitting ucsc ref flat files.
  • MergeOverlappingGenes:
  • Merges transcript models that share a minimum fraction exonic bps. Good for collapsing Cufflinks transcripts.
  • VCFComparator:
  • More modifications to support bed format variant info for somatic key test analysis
  • MultiSampleVCFFilter:
  • Fixed issue with processing gzipped vcf files when a tabix exe isn't provided
  • SomaticSniperVCFParser:
  • New app for inserting quality score into QUAL field and option for score filtering
  • VCFNoCallFilter:
  • New app for removing vcf records where too few background samples pass QC or are no calls. Better than filtering by capture target region.
  • MuTechVCFParser, SomaticSniperVCFParser, StrelkaVCFParser:
  • New apps for manipulating these flavors of vcf files

New in USeq 8.8.8 (Dec 19, 2014)

  • VCFComparator:
  • Major bug! The filter for selecting vcf variants in common regions was inadvertently disabled. Ugg! To find out if your prior analysis was affected, look at the # of pre and post filtered variants. These should differ; if not, discard! Drat!

New in USeq 8.8.7 (Dec 16, 2014)

  • ReadCoverageParser:
  • Renamed to CollectBamStats
  • Added a bunch of functionality to collect both read coverage data and alignment stats from the MergePairedAlignments and Sam2USeq apps
  • MergePairedAlignments
  • Threaded app, generates ChromData for direct import into Sam2USeq; 3x faster overall.
  • DefinedRegionDifferentialSeq
  • Updated DESeq2 scripts to support latest rlog method names
  • Added option to disable independent filtering
  • Several apps dependent on the POI Excel library
  • Updated the POI classes; the old ones were causing several apps to error out on launch.
  • VarScanVCFParser
  • Fixed an issue where the ssc score wasn't getting copied into the QUAL field

New in USeq 8.8.6 (Dec 5, 2014)

  • BisSeq:
  • Added another catch for missing stranded datasets from targeted capture experiments that were throwing null pointer errors.
  • BisStatRegionMaker:
  • Added option to set R path.
  • Sam2USeq:
  • Added option to output base level read coverage PointData in bar format for AggregatePlotter
  • CalculatePerCycleErrorRate:
  • Added option to process 1st and 2nd reads separately
  • ReadCoverageParser:
  • Parses output of Sam2USeq to plot many read coverage plots and flag samples that fail set thresholds for coverage
  • MergePairedAlignments:
  • New app focused on H1K QC. Works with either queryname or coordinate sorted bams.
  • VCFComparator:
  • Added option to use a bed file of key variants instead of a vcf file to allow wiggle in BamSurgeon generated variants

New in USeq 8.8.5 (Oct 31, 2014)

  • MultiSampleVCFFilter:
  • Added option to filter by region
  • Cleaned up chunking issues when no chunking is indicated
  • VarScanVCFParser:
  • New app to extract SOMATIC calls and replace QUAL score with the ssc score.
  • SamAlignmentExtractor:
  • Added option to output alignments that don't intersect the regions.
  • BamSurgeonMutator:
  • Generates random mutations in a list of regions for the BamSurgeon application
  • BisStat:
  • Fixed an issue where some runs were throwing a comparator contract sort error due to lack of data in a particular base context and thus null fraction values

New in USeq 8.8.4 (Oct 14, 2014)

  • MergeSams
  • Created a new app to merge sam and bam files. Creates a stripped header from the files if one isn't provided. This won't play nicely with downstream GATK or Picard apps. Have yet to find a good app for doing this; SamTools' and Picard's are riddled with error/validation headaches.
  • DefinedRegionDifferentialSeq
  • Added catch for alignments that run off the end of the last reported base in a chromosome
  • SamSplitter
  • New app for splitting a sam file in half. Randomly assigns paired read groups to either half.
  • VCFComparator
  • Removed requirement that read depths are provided for the alleles. Added a skip for finding common regions when the same bed file is provided.

New in USeq 8.8.3 (Sep 19, 2014)

  • VCFComparator:
  • Added option to ignore the alt comparison when scoring whether the test and key match. Thus just scores the position.
  • Fixed an issue where SNPs with two alternates were called as non SNPs.
  • Telescriptor
  • Added new app to score two transcriptomes for possible telescripting and 3' UTR changes
  • DefinedRegionDifferentialSeq
  • Added patch to fix DESeq2 rLog method call.

New in USeq 8.8.2 (Aug 7, 2014)

  • ReferenceMutator
  • New app that takes a directory of fasta chromosome sequence files and converts the reference allele to the alternate provided by a snp mapping table
  • SamComparator
  • New app that compares two sam/bam files and splits those into matching and non matching based on coordinates. Good for allele specific expression analysis
  • DefinedRegionDifferentialSeq
  • Added option to collect counts from the 5' and 3' ends of genes
  • DifferentialReadCoverageComparator
  • Takes the count table generated by DRDS with the -z option and looks for changes in 5'/3' read coverage between different conditions to identify short truncated transcripts
  • RandomMutationGenerator
  • Creates snvs and indels for the BAMSurgeon application

New in USeq 8.8.1 (May 23, 2014)

  • VCFSpliceAnnotator:
  • Relaxed the thresholds for calling a novel splice junction or damage to an existing one
  • ChIPSeq, RNASeq, MultipleReplicaScanSeqs:
  • Updated each to utilize the new DESeq2 algorithm.

New in USeq 8.8.0 (May 8, 2014)

  • Updated DESeq to DESeq2. Major changes: different dispersion fit, different log2Rto calculation, automatic independence filtering. All of this results in more diff expressed genes compared to the older deprecated methods. Old diff genes are pretty much a subset of new diff genes.

New in USeq 8.7.9 (Apr 18, 2014)

  • RNAEditingScanSeqs and DefinedRegionRNAEditing:
  • Added option to exclude base observations where the editing was > 0 and supported by only one read
  • DefinedRegionDifferentialSeq:
  • Added a requirement that exons included in differential splicing have 10 or more reads in both t and c samples
  • Added a relative log2Rto diff exon splicing graph for each comparison
  • Exported the coordinates of the exon with the biggest diff log2Rto instead of just the index

New in USeq 8.7.8 (Mar 11, 2014)

  • Added fix for single stranded lambda datasets; these were crashing the app when calculating the non-conversion rate.
  • AllelicMethylationDetector
  • Added fix for single stranded datasets; the app was tossing chromosomes of data that didn't have both a plus and minus strand.
  • VCFComparator
  • Added a catch for GQ scores that are floats instead of ints, which was causing the app to crash.

New in USeq 8.7.7 (Mar 6, 2014)

  • PullMatchingAlignments:
  • New app that writes out alignments that match read names contained in a second file. The output is the alignment in sam format, preceded by true/false, depending on alignment status.
  • ScoreSequences:
  • Added two additional columns: 1) list of motif locations within the read and 2) top-scoring motif location within the read.
  • VCFSpliceAnnotator:
  • Bug fix for vcf output where vcf records with no changes were being truncated
  • VCFComparator
  • Added catch for cases where no vcf records made it through the filtering. These were crashing the application with some datasets.

New in USeq 8.7.6 (Mar 6, 2014)

  • SamTranscriptomeParser & DefinedRegionDifferentialSeq:
  • Replaced the IH tag with NH tag for reporting the number of alignments present per read to be compatible with DEXSeq's HTSeq app.
  • SamReadDepthSubSampler:
  • New app for reducing extreme read depths in amplicon based sequencing datasets
  • VCFSpliceAnnotator
  • Converted max ent scan scores to z-scores
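
For readers unfamiliar with the z-score conversion mentioned for VCFSpliceAnnotator, the arithmetic is z = (score - mean) / standard deviation. The Python sketch below only illustrates that formula. Which set of MaxEntScan scores USeq uses to derive the mean and standard deviation is not stated here, so standardizing against the supplied list itself (with the population standard deviation) is an assumption.

```python
from statistics import mean, pstdev


def to_z_scores(scores):
    """Standardize scores against their own mean and population std dev.

    Assumption: the reference distribution is the list passed in.
    """
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical; every z-score is zero by convention here.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


# Made-up scores with mean 5 and population standard deviation 2.
z = to_z_scores([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

After conversion a score's value reads directly as the number of standard deviations above or below the average, which makes 5' and 3' splice site scores easier to compare.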

New in USeq 8.7.5 (Mar 6, 2014)

  • DRDSAnnotator:
  • Initial Check-in of DRDSAnnotator application
  • SamSVFilter and SamSVJoiner
  • Two new apps for processing alignments used in detecting structural variation
  • ScanSeqs:
  • Fix for FDR graph tracks where in some cases all were shown as negative
  • DefinedRegionDifferentialSeq
  • Added option to run SamSeq instead of DESeq
  • VCFSpliceAnnotator
  • Added vcf output and sj annotations
  • Multiple testing correction for pvalues

New in USeq 8.7.4 (Jan 10, 2014)

  • MiRNACorrelator:
  • Added a catch to skip calculating stats on bins with only one value
  • Fixed bug with the ordering of the bins in the ggplot
  • NovoalignBisulfiteParser:
  • Added an option to first call Picard's SortSam and MarkDuplicates
  • BisSeqAggregatePlotter:
  • Fixed a bug that would result in no output data. This bug only occurs when the first encountered chromosome has no data in any of its regions
  • Gr2Bar
  • Users can now specify orientation when running this app.
  • MicrosatelliteCounter:
  • Initial check-in of Microsatellite counter application.
  • VCFSpliceAnnotator:
  • New app for scoring vcf files for gain or loss of splice junctions using the MaxEntScan algorithms.

New in USeq 8.7.3 (Dec 21, 2013)

  • DefinedRegionDifferentialSeq:
  • Changed the default minimum read count threshold to 20 from 10. The lower number was allowing too many low hit genes into the multiple testing correction and causing the FDRs to be significantly lower than necessary.
  • VCFComparator:
  • Added checks to convert 0|1 genotypes to 0/1, and 1/0 calls to 0/1; these are needed to standardize the calls and enable genotype matching with the NIST key
  • SamTranscriptomeParser:
  • Bug fix for merging paired alignments where an insertion occurred immediately before the start of the second read
  • MultiSampleVCFFilter:
  • Fixed the behavior of the -M flag, which wasn't intersecting the proper samples
  • VCFSample:
  • The GATK HaplotypeCaller reports InDels with no coverage depth, which would break USeq VCF applications. If the sample has no coverage depth '.', the sample is now marked as a 'no call'
  • SamSVFilter:
  • New application to filter name sorted, novo raw output, for alignments indicative of structural variation
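
The genotype standardization noted for VCFComparator in this release (0|1 to 0/1, and 1/0 to 0/1) can be sketched in a few lines of Python. The function name is hypothetical, and sorting alleles numerically for genotypes beyond the two cases listed (for example 2/1) is an assumed generalization, not something the release note states.

```python
def normalize_genotype(gt):
    """Make a GT string unphased with alleles in ascending order.

    Covers the two documented cases: 0|1 -> 0/1 and 1/0 -> 0/1.
    """
    alleles = gt.replace("|", "/").split("/")
    # Only reorder when every allele is numeric; no-calls like ./. are
    # left alone apart from the separator change.
    if all(a.isdigit() for a in alleles):
        alleles.sort(key=int)
    return "/".join(alleles)


examples = {gt: normalize_genotype(gt) for gt in ["0|1", "1/0", "1|1", "./.", "2/1"]}
```

With both key and test calls passed through the same normalization, a plain string comparison is enough to decide whether two genotypes match.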

New in USeq 8.7.2 (Dec 10, 2013)

  • RNAEditingScanSeqs & DefinedRegionRNAEditing:
  • Added ability to set the minimum base read coverage threshold, defaults to 5 alignments
  • DefinedRegionDifferentialSeq:
  • Fixed an issue with recognizing a failed glm fit error message that was being inappropriately ignored leading to reduced FDRs in some cases.
  • BisStat:
  • Extensive modifications to support data from only one strand. Prior to this, such chromosomes were skipped. Needed for amplicon datasets.
  • ScoreMethylatedRegions:
  • Modifications to support datasets lacking whole strands, needed for amplicon data.
  • DefinedRegionBisSeq:
  • Modifications to support datasets lacking whole strands, needed for amplicon data.
  • BisSeq:
  • Modifications to support datasets lacking whole strands, needed for amplicon data.

New in USeq 8.7.1 (Nov 28, 2013)

  • DefinedRegionDifferentialSeq:
  • Fixed a bug where FDRs that DESeq output as "NA" were being assigned the max FDR; these are now set to 0.
  • Note, this won't affect most analyses since these genes also have a log2 ratio of 0 and wouldn't survive standard log2Ratio thresholding.
  • NovoalignBisulfiteParser:
  • Fixed a bug in the -b4 novoalignment parsing
  • Removed support for native novoalign format, just sam/bam now.

New in USeq 8.7.0 (Nov 21, 2013)

  • SRAProcessor:
  • Added option to set quality score offset to 64, needed for some Illumina datasets
  • MiRNACorrelator:
  • Added ggplot2 box whisker plot functionality
  • MaxEntScan Score3 and Score5:
  • Java implementation of Yeo and Burge's Max Ent Scan algorithms for human splice site detection
  • NovoalignBisulfiteParser:
  • Added functionality for processing -b4 novoalignments
  • Enabled merging of paired overlapping reads for sam formatted alignments
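
The quality score offset of 64 mentioned for SRAProcessor refers to the older Illumina Phred+64 encoding, where each quality character is the Phred score plus 64 rather than plus 33. A small Python sketch of re-encoding one quality string follows. The function name is hypothetical and this is not how SRAProcessor itself is implemented.

```python
def phred64_to_phred33(quality):
    """Re-encode a Phred+64 quality string as Phred+33."""
    converted = []
    for ch in quality:
        score = ord(ch) - 64  # recover the Phred score
        if score < 0:
            raise ValueError(f"{ch!r} is not a valid Phred+64 character")
        converted.append(chr(score + 33))
    return "".join(converted)


# 'h' encodes Q40, 'g' encodes Q39, and 'B' encodes Q2 under Phred+64.
converted = phred64_to_phred33("hhhgB")
```

Treating a Phred+64 file as Phred+33 inflates every score by 31, so getting the offset right matters for any downstream quality filtering.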

New in USeq 8.6.9 (Nov 5, 2013)

  • VCFSample:
  • Changed how it recognizes a no call to anything that starts with ./. so these are skipped
  • DefinedRegionDifferentialSeq:
  • Added a System.setProperty("java.awt.headless", "true") to allow running app in non x11 mode.
  • FilterIntersectingRegions:
  • Added option to split gtf/ gff files
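
The VCFSample change above treats any sample field starting with ./. as a no call. As a minimal Python illustration (the function name and the example sample strings are made up for this sketch):

```python
def is_no_call(sample_field):
    """True when a VCF sample column starts with './.'."""
    return sample_field.startswith("./.")


samples = ["0/1:35,12:47:99", "./.", "./.:.:.", "1/1:0,20:20:60"]
# Keep only the samples with a real genotype call.
called = [s for s in samples if not is_no_call(s)]
```

Matching on the prefix rather than the whole field catches no calls that still carry placeholder FORMAT values after the genotype.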

New in USeq 8.6.8 (Oct 16, 2013)

  • MethylationArrayScanner:
  • Removed a requirement that non paired analysis contain matching t and c samples
  • CHPCAligner:
  • Modified to work with the kingspeak cluster
  • MiRNACorrelator:
  • New application for associating changes in miRNA and gene expression values
  • USeq2UCSCBig:
  • Added catch to rename broken useq archives so these are skipped in subsequent autoconversion

New in USeq 8.6.7 (Oct 16, 2013)

  • ExportExons:
  • Added name score and strand to the output
  • DefinedRegionDifferentialSeq:
  • Fixed a bug where genes with excessive counts were still being included in the DESeq analysis
  • Added a check for splice analysis with one gene
  • USeq2UCSCBig:
  • Added a sandbox to the UCSC executables to limit memory and time on linux systems
  • SamTranscriptomeParser:
  • Added another check for malformed sam records

New in USeq 8.6.6 (Oct 16, 2013)

  • DefinedRegionDifferentialSeq:
  • Fixed an error where improperly paired reads weren't counted towards a gene's coverage. This only affected stranded analysis.
  • Now, the -j and -p options work properly on stranded analysis
  • Fixed issue with running a no replica multiple condition analysis where DESeq was exiting with an error.
  • ScoreSequences:
  • Modified output to table format, including score summary and each hit

New in USeq 8.6.5 (Aug 22, 2013)

  • EnrichedRegionMaker:
  • Enabled the app to work with .swi, .swi.gz, and .swi.zip files
  • BisStat & BisSeq:
  • Autoconverting bar directories to useq archives
  • DefinedRegionDifferentialSeq:
  • Major rewrite to incorporate all sample normalization into DESeq
  • Added direct formatted output to an actual Excel xlsx document

New in USeq 8.6.4 (Jul 18, 2013)

  • DefinedRegionDifferentialSeq
  • Added internal check to remove edgeR analysis and proceed if the app throws errors.
  • SamSubsampler
  • New app to subsample a sam/bam file after filtering and randomizing. Needed for comparing read coverage graphs.
  • BisStat
  • Added option to export base level log 2 ratio of fraction methylation in T vs C graphs.
  • Renamed graph folders to indicate Base or Window level data

New in USeq 8.6.3 (Jul 3, 2013)

  • MethylationArrayDefinedRegionScanner
  • Modified app so it works with uneven numbers of t and c
  • SamParser
  • Modified the way the mid point of an alignment is calculated. It is now the alignment start + 1/2 the read length. Did this to avoid problems with splice junction reads.
  • Set the bam file reader stringency to silent.
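
To see why the SamParser mid point change helps with splice junction reads, compare the two calculations in the Python sketch below. The new formula (alignment start + 1/2 the read length) comes from the note above. The span-based alternative is an assumption included only for contrast, since the earlier calculation is not described here, and integer division is likewise assumed.

```python
def midpoint_from_read_length(start, read_length):
    """Mid point as alignment start + half the read length."""
    return start + read_length // 2


def midpoint_from_span(start, end):
    """Assumed alternative: middle of the genomic span (end exclusive)."""
    return start + (end - start) // 2


# Hypothetical 50 bp read aligned as 25M5000N25M starting at position 1000.
start, read_length = 1000, 50
end = start + 25 + 5000 + 25  # the span includes the 5000 bp intron

new_mid = midpoint_from_read_length(start, read_length)
span_mid = midpoint_from_span(start, end)
```

Here the span-based mid point lands deep inside the intron, thousands of bases from any sequenced base, while the read-length version stays next to the first exon block.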

New in USeq 8.6.2 (Jun 25, 2013)

  • RNAEditingScanSeqs
  • Added a minimum base fraction edited option to app to restrict what base observations are allowed into the analysis.
  • RNAEditingPileUpParser
  • Fixed a bug where the parser was counting all of the reference reads (both plus and minus) when scoring a base for editing instead of just reads mapping to the matched strand when a stranded parsing was selected.

New in USeq 8.6.1 (Jun 21, 2013)

  • RandomizeTextFile
  • Added ability to process all files in a directory.
  • Gzipping output.
  • FetchGenomicSequences
  • Enabled working with gzipped fasta data
  • DefinedRegionDifferentialSeq
  • Added ANOVA-like edgeR analysis to the output for runs with more than 2 conditions.

New in USeq 8.6.0 (Jun 3, 2013)

  • NovoalignBisulfiteParser
  • Added option to parse xxx.bam files
  • Removed -u unique alignment option since this only works with native novo format and was confusing the sam folks
  • SamTranscriptomeParser
  • Fixed a bug with merging paired RNASeq datasets that was causing the strand to be incorrectly assigned to the merged alignment.

New in USeq 8.5.8 (May 18, 2013)

  • SamTranscriptomeParser
  • Modified app to gzip temp files and when a header is provided to directly write out the results without an intermediate file. Should cut down on disk usage.
  • NovoalignBisulfiteParser
  • Reduced the default thresholds for minimum base quality to 13. New bis-seq data is showing reduced qualities (probably due to different calibration) causing some datasets to show a significant reduction in data.
  • Added a fraction bases passing base quality to report out this statistic and flag such cases.
  • VCFComparator
  • Added a new spreadsheet of just dFDR and TPR for each sample when more than one to make graphing easier.

New in USeq 8.5.7 (May 18, 2013)

  • MultiSampleVCFFilter
  • Added option to filter on VQSLOD scores
  • MethylationArrayScanSeqs
  • New app for identifying DMRs from array based methylation assays. Will work for other types of array data too.

New in USeq 8.5.6 (Apr 19, 2013)

  • DefinedRegionRNAEditing
  • Added option to perform stranded analysis.

New in USeq 8.5.5 (Apr 18, 2013)

  • New app that cleans up big old files in user's Tomato job directories
  • Sam2USeq
  • Added mean, median, min, max coverage stats for user defined regions
  • VCFComparator
  • New app for comparing a key of trusted variant calls to a test vcf file. Generates stats for ROC curves and allows unambiguous comparisons between processing pipelines.
  • RNAEditingScanSeqs
  • Added option to perform stranded analysis.

New in USeq 8.5.4 (Mar 26, 2013)

  • MultiSampleVCFFilter
  • New app for splitting multiple sample vcf file(s) records into those that pass or fail a variety of conditions and sample level thresholds
  • DefinedRegionDifferentialSeq
  • Set the validation stringency on the Picard SAMReaders to silent

New in USeq 8.5.3 (Mar 26, 2013)

  • DefinedRegionRNAEditing
  • New app for scoring user defined regions for RNAEditing, similar stats to RNAEditingScanSeqs

New in USeq 8.5.2 (Mar 26, 2013)

  • RNAEditingScanSeqs
  • Added FDR estimation for clustered edited sites
  • AggregatePlotter
  • Added check for regions calling for non existent point data

New in USeq 8.5.1 (Mar 26, 2013)

  • MergePairedSamAlignments
  • Fixed bug when merging bam files where double line returns were causing Picard's SortSam to error out.
  • VCFTabix
  • New app for recursing through directories and tabix indexing vcf files
  • USeq2UCSCBig and UCSCBig2USeq
  • Added new options for skipping or forcing overwrite of already converted files
  • Added options to silence all but error messages
  • CalculatePerCycleErrorRate
  • Added ability to work with unsorted sam files
  • Added option to require read names start with a given prefix. Novoalign is adding in some junk reads to their output?
  • MergeUCSCGeneTable
  • Fixed bug where 1bp terminal exons were being skipped causing the conversion of bed12 useq files to bb to throw an error.
  • SamTranscriptomeParser
  • Fixed bug where insertions or deletions that occurred at the splice junction were causing the failure to insert an appropriate number of Ns

New in USeq 8.5.0 (Mar 26, 2013)

  • CalculatePerCycleErrorRate
  • New app for calculating the per cycle error rate from phiX alignments.
  • MergePairedSamAlignments
  • Modified so that it doesn't merge phiX alignments so that the CalculatePerCycleErrorRate app can be used on merged data.
  • Removes chrAdapt* alignments automatically, no option to not remove. These are saved to the bad alignment file.
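
The per cycle error rate introduced with CalculatePerCycleErrorRate is, in essence, the fraction of mismatching bases at each sequencing cycle across the phiX alignments. The Python sketch below shows that calculation on toy data. The input layout (read bases paired with the matching reference bases, in cycle order, with no indels) and the function name are simplifying assumptions, not the app's actual interface.

```python
def per_cycle_error_rate(pairs):
    """Return the mismatch fraction for each cycle.

    `pairs` is an iterable of (read_bases, reference_bases) strings of
    equal length, ordered by sequencing cycle.
    """
    mismatches = []
    totals = []
    for read, ref in pairs:
        for cycle, (read_base, ref_base) in enumerate(zip(read, ref)):
            if cycle == len(totals):
                # First time this cycle is seen; start its counters.
                totals.append(0)
                mismatches.append(0)
            totals[cycle] += 1
            if read_base != ref_base:
                mismatches[cycle] += 1
    return [m / t for m, t in zip(mismatches, totals)]


pairs = [
    ("ACGT", "ACGT"),
    ("ACGA", "ACGT"),  # error at cycle 4
    ("ACTT", "ACGT"),  # error at cycle 3
    ("ACGC", "ACGT"),  # error at cycle 4
]
rates = per_cycle_error_rate(pairs)
```

Because phiX is a known control sequence, mismatches against it can be read as sequencing errors, and a rising rate toward later cycles is the usual pattern to look for.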