FastQC Changelog

What's new in FastQC 0.11.3

Sep 8, 2015
  • Fixed:
  • Fixed a bug when using the limits.txt file to disable the per tile analysis module.
  • Fixed a documentation error in the duplicated sequences plot.
  • Fixed a thread safety bug when processing multiple files in a single session which caused the program not to exit when all processing had in fact completed.
  • Fixed a bug which meant that forced formats in the interactive application weren't being honoured.
  • Fixed a bug in the way soft clipping was applied when we were analysing only mapped data.
  • Fix a memory issue when trying to parse tile names in cases where we mistakenly think we're identifying tile numbers, but we aren't.
  • Fix a bug in the text reporting of per-tile quality scores.
  • Add the SOLID smallRNA adapter to the default adapter search set.
  • Fix a bug in casava mode when using uncompressed fastq files.
  • Increase the number of sampled sequences in the duplicate and overrepresented module to 100,000.
  • Add a clean up of data structures for the Kmer module so that the interactive mode can process more files without dying.

New in FastQC 0.11.2 (Sep 8, 2015)

  • Fixed:
  • Added a proper implementation of a --limits command line option to allow users to specify a custom limits file for an individual run. This also fixed a bug seen if the user used the --adapter option.
  • Fixed an error in the naming of the folder inside the zip file such that it couldn't be extracted into the same folder as the main HTML file.
  • Fixed an overly large data structure which was causing some runs to terminate due to a lack of memory.
  • Fixed a poor implementation in the Kmer module which was causing unusually high memory usage.
  • Fixed incorrect defaults for the warn/fail values in the per sequence quality module.

New in FastQC 0.11.1 (Sep 8, 2015)

  • The major new features in this release are:
  • Configurable thresholds for modules. For all modules you can now alter a configuration file to set the thresholds used by the program for warnings and errors so that you can flag up only the types of problem which you are concerned by.
  • Optional modules. The same configuration file used to set the warn / error thresholds can also be used to selectively disable modules you don't want to see at all.
  • New per-tile quality analysis. If you are running Illumina libraries through FastQC it will now analyse the quality calls on a per-tile basis and will flag up points in the run where the quality in individual tiles fell below the average quality for that cycle. This can help to spot technical problems during the run.
  • New adapter content module. A new module has been added to specifically search for the presence of adapters in your library. This operates in a similar way to the existing Kmer analysis but allows you to specify individual adapter sequences to screen and will always show the results for each adapter so you can easily see what you might gain if you chose to adapter trim your library.
  • Improved duplication plot. The duplication plot has been given an overhaul so that it now reports values which are real read numbers rather than always giving relative values. It also shows how the level of duplication would affect the library both before and after deduplication, and the headline figure is now much more useful as it shows the percentage of the library which would remain if you chose to deduplicate.
  • Improved Kmer module. The Kmer module has been changed so that instead of trying to search for individual Kmers which are present at higher than expected frequency (which actually happens all the time in real libraries), it now looks for Kmers which are present in significantly different amounts at different starting positions within the library. This has allowed the use of longer Kmer sequences to give a more useful result.
  • Single file reports. The default output format for the program is now a single HTML file with all of the various graphs embedded into it. The .zip file output with the individual graphs is still produced, as are the associated data files, but you can just distribute the one HTML file alone - the other data is no longer required.
  • Ability to read from stdin. If you want to pipe a stream of data into fastqc rather than using a real file then you can just use 'stdin' as the filename to process and then stream uncompressed fastq data on stdin.
  • Changed base groupings. For long reads we used to use an exponential series to group bases together to summarise the sequence content and qualities. We've now switched the default so that for grouped plots the first 9 individual bases will always be shown (since this often roots out problems in the libraries), and after that there will be a series of evenly sized windows so that the same number of bases fall into each window. You can bring back the old behaviour with the new --expgroup option, and you can remove grouping altogether with the --nogroup option.
  • Dropped support for the Solexa64 (but NOT Phred64) encoding. We have removed the ability of the program to reliably detect the original Solexa64 encoding which was used in the GA pipeline prior to v1.3. This was a 64 offset encoding, but one which allowed scores which ranged down to -5. Supporting this encoding meant that we would incorrectly guess the encoding on Phred33 files which had no bases with quality scores below 26, which could happen if you aggressively trimmed your data. Supporting just Phred33 and Phred64 now means that we wouldn't mis-detect unless there were no bases with qualities below 31, which is much less likely, even in trimmed data. Since no Solexa64 data will have been produced since early 2009 it is unlikely that removing support for this format will adversely affect users of the program.
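
The thresholds of 26 and 31 quoted above follow from the ASCII offsets. A Phred64 file never contains a character below ASCII 64, which corresponds to quality 31 in Phred33 (64 - 33). With Solexa64 also allowed, scores down to -5 reach ASCII 59, which corresponds to quality 26 in Phred33 (59 - 33). The following is a minimal Python sketch of that reasoning, assuming detection relies only on the lowest quality character seen. The function name guess_encoding is hypothetical and this is not FastQC's actual implementation.

```python
def guess_encoding(quality_strings):
    """Guess the quality encoding from the lowest ASCII character seen.

    Hypothetical helper for illustration only; not FastQC's actual code.
    """
    lowest = min(ord(ch) for q in quality_strings for ch in q)
    if lowest < 33:
        raise ValueError("no supported encoding uses characters below ASCII 33")
    # Any character below '@' (ASCII 64) cannot be Phred64, so assume Phred33.
    if lowest < 64:
        return "Phred33", 33
    return "Phred64", 64


# Phred33 data containing a low quality base ('#', quality 2) is detected correctly.
print(guess_encoding(["IIII#III"]))

# Phred33 data trimmed so that every base has quality 31 or more ('@' or above)
# cannot be told apart from Phred64 and would be mis-detected.
print(guess_encoding(["IIIIHHGF@"]))
```

The second call shows the remaining ambiguity described in the entry above: only files with no base below quality 31 can still be guessed wrongly.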

New in FastQC 0.10.1 (Sep 8, 2015)

  • FastQC v0.10.1 is a bugfix release which works around two problems people have encountered with previous releases:
  • A workaround has been put into place for a limitation in the java gzip decompressor, where it would read only the first compressed block in a file created by concatenating multiple gzipped files directly, rather than decompressing and recompressing them.
  • Users who had installed the program in directories containing characters required to be encoded in URLs (= & ? etc) were finding that the report generation generated an error. This encoding has now been fixed and the program should now have no limits in the name of the directory in which it can be installed.
  • One additional feature is that in the fastqc wrapper you can now specify the location of the java interpreter on the command line using the --java parameter, rather than having to have it included in the path.
  • One other change in this release is that the package names for all of the java classes have been changed to reflect a change in the official project URL. This means that the launchers for the program have had to be updated to use these new names. If you have created your own launcher, or had copied any of the old ones you'll need to update this to use the launchers included in this version of the program.
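
The concatenated gzip limitation mentioned above is a general property of the format: a file made by joining several gzipped files holds several independent gzip members, and a reader which stops at the end of the first member silently loses the rest. The sketch below reproduces the behaviour in Python using the standard zlib module. It only illustrates the problem and one way to read every member, and does not show the workaround used in FastQC's Java code. The function names are hypothetical.

```python
import gzip
import zlib

GZIP_WBITS = 16 + zlib.MAX_WBITS  # the +16 tells zlib to expect a gzip container


def decompress_first_member(data):
    """Decompress only the first gzip member, mimicking the limitation."""
    decoder = zlib.decompressobj(GZIP_WBITS)
    return decoder.decompress(data)


def decompress_all_members(data):
    """Decompress every gzip member in a concatenated stream."""
    chunks = []
    while data:
        decoder = zlib.decompressobj(GZIP_WBITS)
        chunks.append(decoder.decompress(data))
        data = decoder.unused_data  # bytes left over after this member ended
    return b"".join(chunks)


part1 = gzip.compress(b"@read1\nACGT\n+\nIIII\n")
part2 = gzip.compress(b"@read2\nTTGA\n+\nIIII\n")
combined = part1 + part2  # same result as concatenating the two .gz files on disk

print(decompress_first_member(combined))  # only the first record
print(decompress_all_members(combined))   # both records
```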

New in FastQC 0.10.0 (Sep 8, 2015)

  • The major feature of FastQC v0.10.0 is the addition of support for fastq files generated directly by the latest version of the illumina pipeline (Casava v1.8).
  • In this version the pipeline generates gzipped fastq files by default rather than using qseq files which can then be converted to fastq. However the fastq files generated by casava are unusual in two ways:
  • A single sample produces a set of fastq files with a common name, but an incrementing number at the end.
  • Casava FastQ files contain sequences from clusters which have failed the internal QC, and been flagged to be filtered.
  • FastQC v0.10.0 introduces a Casava mode which will merge together fastq files from the same sample group and produce a single report. It will also exclude any flagged entries from the analysis. You would therefore run FastQC as normal but selecting all of the fastq files from Casava and using casava mode for your analysis.
  • Casava mode is activated from the command line by adding the --casava option to the launch command. From the interactive application you need to select 'Casava FastQ Files' from the drop down file selector filter options.
  • If you want to analyse casava fastq files without these extra options then you can treat them as normal fastq files with no problems.
  • In addition to this change there have also been changes to allow the wrapper script to work properly under windows, and a bug was fixed which missed off the last possible Kmer from every sequence in a library.
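
The two steps of Casava mode described above (merging files which share a name apart from an incrementing number, and skipping entries flagged as filtered) can be sketched in Python as follows. This is an illustration under stated assumptions, not FastQC's implementation: the file name pattern (a three digit number before the extension) and the header layout (a space, then read number, then a Y or N filter flag) are assumptions about Casava 1.8 output, and both function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming pattern: a common base name, then _NNN, then the extension.
CASAVA_NAME = re.compile(r"^(?P<base>.+)_(?P<num>\d{3})\.fastq(\.gz)?$")


def group_casava_files(names):
    """Group file names which differ only in their trailing number."""
    groups = defaultdict(list)
    for name in names:
        match = CASAVA_NAME.match(name)
        key = match.group("base") if match else name  # unmatched files stay alone
        groups[key].append(name)
    return {key: sorted(files) for key, files in groups.items()}


def is_filtered(header):
    """Return True if the header carries the filter flag 'Y'.

    Assumes a header of the form '@<instrument fields> <read>:<Y|N>:...'.
    """
    parts = header.split(" ", 1)
    if len(parts) < 2:
        return False
    fields = parts[1].split(":")
    return len(fields) > 1 and fields[1] == "Y"


files = [
    "s1_ACGT_L001_R1_002.fastq.gz",
    "s1_ACGT_L001_R1_001.fastq.gz",
    "s2_TTAG_L001_R1_001.fastq.gz",
]
print(group_casava_files(files))
print(is_filtered("@M1:1:FC1:2:2104:15343:197393 1:Y:18:ATCACG"))
print(is_filtered("@M1:1:FC1:2:2104:15343:197393 1:N:18:ATCACG"))
```

Each group would then be read in order as if it were one file, with any record for which is_filtered returns True left out of the analysis.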

New in FastQC 0.9.6 (Sep 8, 2015)

  • FastQC v0.9.6 fixes a couple of bugs which aren't likely to have affected the majority of fastqc users:
  • Fixed a crash in the Kmer module when analysing a library in which every sequence contained a poly-N stretch at its 3' end.
  • Fixed the wrapper script so that OSX users launching fastqc through the script rather than the Mac application bundle get their classpath set correctly, and can therefore analyse bam/sam files.

New in FastQC 0.9.5 (Sep 8, 2015)

  • FastQC v0.9.5 fixes some bugs in the program's text output and improves a few things in the graphical interface. Main changes are:
  • Progress calculations are now exact and not approximate
  • The UI now has a welcome screen so you're not just presented with a blank screen when the program starts
  • The wrapper script now sets the classpath correctly in windows as well as linux.
  • The text report for per-base sequence content now reports correct values for grouped bases
  • The HTML report uses a custom stylesheet for print output so graphs aren't cut off when reports are printed.
  • Fixed a bug in testing for a warning in the per-base sequence content module.
  • Alt text in HTML reports now matches the graphic it describes

New in FastQC 0.9.4 (Sep 8, 2015)

  • FastQC v0.9.4 is a minor bugfix release which changes the offline version of the program so that if a file fails to be processed a full backtrace of the error is produced, rather than just a simple generic error message.

New in FastQC 0.9.3 (Sep 8, 2015)

  • FastQC v0.9.3 adds support for fastq files compressed with bzip2 in addition to its existing support for gzip compressed files. It's worth noting that although bzip2 offers a greater reduction in the size of the compressed files (about a 5-fold size reduction compared to raw fastq, where gzip gives a 4-fold decrease), there is a significant penalty in terms of the speed of decompression of these files. In our tests gzipped files were actually processed slightly faster than raw FastQ files, presumably due to the lower amount of data transfer from disk required; however, bzip2 compressed files took around 6X as long to process as gzipped files.
  • The other big change in this release is an update to the default CSS layout such that viewing the HTML reports doesn't require lots of scrolling up and down the page. As before the CSS can be edited and customised by editing the templates shipped with the program.
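
For readers handling the same three cases (raw, gzip and bzip2 fastq) in their own scripts, the sketch below shows one way to pick a reader in Python. It assumes the compression type can be taken from the file extension, which may not match how FastQC itself decides, and the function name open_fastq is hypothetical.

```python
import bz2
import gzip


def open_fastq(path):
    """Open a raw, gzip or bzip2 compressed fastq file as text.

    Assumes the extension reflects the compression actually used.
    """
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt")
    return open(path, "r")


def count_records(path):
    """Count fastq records, assuming exactly four lines per record."""
    with open_fastq(path) as handle:
        return sum(1 for _ in handle) // 4
```

Timing count_records on the same data in each format is a simple way to check the speed difference reported above on your own hardware.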

New in FastQC 0.9.2 (Sep 8, 2015)

  • FastQC v0.9.2 fixes two bugs which were identified in the previous release.
  • In the text output for the per-base quality module the correct base numbers weren't being included for files which used grouped base ranges.
  • The Kmer analysis module could crash when analysing very small files, such that no position in the file had more than 1000 instances of an enriched Kmer.
  • Both of these issues should now be resolved.

New in FastQC 0.9.1 (Sep 8, 2015)

  • FastQC v0.9.1 adds some new command line options and fixes a couple of usability issues.
  • The new command line options are:
  • --quiet Will suppress all progress messages and ensure that only warnings or errors are written to stderr. This might be useful for people running fastqc as part of a pipeline.
  • --nogroup Will turn off the dynamic grouping of bases in the various per-base plots. This would allow you to see a result for each base of a 100bp run for example. This option should not be used for really long reads (454, PacBio etc) since it has the potential to crash the program or generate very wide output plots.
  • In addition to these the following changes have been made:
  • The basic stats module now includes a line to say which type of quality encoding was found so this information isn't just present in the header of the per-base quality plot, and will appear in the text based output.
  • We now distinguish between Illumina

New in FastQC 0.8.0 (Jan 22, 2011)

  • Made all graphs easier to interpret
  • Added an option to analyse only mapped sequences from a BAM/SAM file
  • Added an option to analyse two or more files in parallel

New in FastQC 0.7.2 (Jan 14, 2011)

  • Fixed bug when analysing libraries with no unique sequences
  • Added an option to specify a custom contaminant list on the command line

New in FastQC 0.7.1 (Jan 14, 2011)

  • Improved the command line interface with proper options and error handling
  • Added an option to force the file format where guessing from the filename doesn't work

New in FastQC 0.7.0 (Jan 14, 2011)

  • Added a Kmer enrichment analysis to find non-aligned enriched sequences
  • Cleaned up axis labels on all graphs

New in FastQC 0.6.1 (Jan 14, 2011)

  • Fixed a bug which caused some sequences and qualities from BAM/SAM files to be reversed

New in FastQC 0.6.0 (Jan 14, 2011)

  • Sequences can now be read from SAM/BAM format files
  • Added smoother lines to the graphs

New in FastQC 0.5.1 (Jan 14, 2011)

  • Fixed a formatting bug in the text output
  • Fixed the %GC plot to work well with reads over 100bp
  • Improved the fitting of the modelled curve to the %GC plot
  • Added more illumina oligos to the contaminants file

New in FastQC 0.5.0 (Jan 14, 2011)

  • Improved the fitting of the normal distribution to %GC plot
  • Calculated the total duplicated sequence % in the duplicate sequence module
  • Added pass/fail/warn icons next to each section of the HTML report
  • Put Icons and Images into subfolders in the HTML report

New in FastQC 0.4.3 (Jan 14, 2011)

  • Fixed the reporting of sequence counts in the Basic Stats module
  • Added a warning before overwriting reports in the interactive application

New in FastQC 0.4.2 (Jan 14, 2011)

  • Fixed y-axis scale on per-base quality plot
  • Added fail / warn checks to modules which lacked them and improved existing checks
  • Added a modelled distribution to the per-sequence GC plot
  • Scale the width of report graphs for long sequence reads

New in FastQC 0.4.1 (Jan 14, 2011)

  • Changed the duplicate module to reduce memory usage for long sequences
  • Changed the way duplicate levels are counted to be more realistic