CLC Genomics Workbench Changelog

What's new in CLC Genomics Workbench 6.5 Build 94986

Sep 30, 2013
  • New features:
  • Variant detection:
  • New tool for adjusting read mappings through local realignment. The Local Realignment tool can realign unaligned ends, realign using a guidance variant track (e.g. obtained from external resources such as dbSNP, from the Indels and Structural Variants tool described below, or from analysis of other read mappings), and realign multiple samples. It was previously available as a beta plugin.
  • New tool for detecting structural variants (detects insertions and deletions, intra-chromosomal translocations, tandem duplications and inversions) working on "unaligned ends (soft clippings)". It was previously available as a beta plugin.
  • Important changes to variant reporting: adjacent variants are now reported as one variant instead of linked variants.
  • A new variant filter has been added to both “Probabilistic Variant Detection” and “Quality-based Variant Detection”: “Ignore variants in non-specific regions”. This new filter ensures that variants in regions covered by just a few non-specific reads are ignored.
  • Probabilistic Variant Detection: A new threshold filter, “Required variant count”, has been added to the wizard. This filter ensures that only variants present in a number of reads that exceeds the specified threshold are called.
  • Quality-based Variant Detection: Addition of a new column that reports hyper-allelic status of variants. This is based on the specified threshold “Maximum expected allele” in the “Set genome information” wizard under “Ploidy”. The output in the table is “Yes” or “No” with respect to whether the threshold has been exceeded.
  • A new column has been added to the variant track table that describes the length of the insertions, deletions, and replacements. This makes it possible to filter on the length of e.g. insertions/deletions.
  • VCF export is now using genotype fields. The tag CLCAD is used for count of a variant, and PL is used for coverage. In this version, one variant track will result in one VCF file.
  • Variant annotation:
  • New tool for comparing variants between two samples
  • Filter against known variants and Annotate from known variants: An MNV in the input track can be annotated as a partial match of an SNV in the track of known variants, if the SNV is a subset of the MNV.
  • Filter against known variants: There is a new option to let MNVs be annotated as an exact match if several SNVs in the track of known variants can be joined to represent the full MNV allele sequence in the input track.
  • When running the “Annotate with overlap information” tool using an annotation track as input and a variant track as parameter track, the column describing the specific variant in the Track Table now shows the position and description of the variants. The variant description also appears in the track tooltips when holding the mouse over the variants.
  • Workflows:
  • Automatic adjustment of layout in workflows. It is now (again) possible to adjust the connected workflow elements automatically. Right click in the workflow editor to access a menu with the option "Layout". Clicking on "Layout" will adjust the layout of the workflow. The layout can also be adjusted with the quick command Shift + Alt + L.
  • Automatic update of tools in workflows. Tools in existing workflows will automatically be updated when opened from the Navigation Area. If new parameters have been added to the updated version of the tool, these will be used with their default settings. A workflow can be kept in its original form by saving the updated workflow with a new name as this will ensure that the old workflow is kept rather than being overwritten.
  • Information: In the “Manage Workflows” dialog a new tab has been added providing information about the workflow (such as when it was built and which version of the workbench was used).
  • Highlight used elements: In the Side Panel under “View mode” it is now possible to select “Highlight used elements”, which will show all elements that are used in the workflow. Unused elements are grayed out. “Highlight used elements” can also be activated with the quick command Alt + Shift + U.
  • Highlight Subsequent Path: A further development of the old option “Mark Subsequent Path”. If you right click on the name of one of the tools in a workflow, it is possible to select “Highlight Subsequent Path”, which will highlight the path in the workflow from the tool that was clicked on and further downstream. All other elements in the workflow will be grayed out.
  • Create Installer: A workflow can now be installed directly from the workbench. This can be done with the “Create Installer” button (or the quick command Alt + Shift + I). Three options exist in the “Create Installer” dialog: 1) Install the workflow on your local computer, 2) Install the workflow on the current server (requires that you are logged on to the server and that you are the administrator), and 3) Create an installer file to install it on another computer.
  • Export can now be part of workflows.
  • Workflow enabled elements can be dragged directly from the Toolbox into the workflow editor.
  • Workflow images can be copied from the editor by using Ctrl + C (copy) and pasted into the desired destination with the Ctrl + V command.
  • All elements can be removed from the workflow with the shortcut Alt + Shift + R.
  • Previously, when running the “ChIP-Seq Analysis” tool, the result would be a copy of the read mapping with annotations added. Now the annotations are added to the read mapping used as input. Workflows using the "ChIP-Seq Analysis" tool must be manually updated, deleting the ChIP-Seq workflow element and adding it again.
  • Reinstallation of modified workflows can now be done directly with the “Create Installer” function. A pop-up dialog provides the option to make "forced import" of an already installed workflow.
  • Speed improvements in the workflow editor mean that the user experience when editing large workflows has been greatly improved.
  • New tools that are now workflow-enabled:
  • Classical Sequence Analysis, Alignments and Trees
  • Create Tree
  • Maximum Likelihood Phylogeny
  • Classical Sequence Analysis, General Sequence Analysis
  • Extract Annotations
  • Classical Sequence Analysis, Nucleotide Analysis
  • Reverse Complement Sequence
  • Reverse Sequence
  • Molecular Biology tools, Sequencing Data Analysis
  • Assemble Sequences
  • Assemble Sequences to Reference
  • Secondary Peak Calling
  • Track Tools, Annotate and Filter
  • Extract Reads Based on Overlap
  • Track Tools, Graphs
  • Identify Graph Threshold Areas
  • Resequencing Analysis, Compare Variants
  • Compare Sample Variant Tracks
  • Transcriptomics Analysis, General Plots
  • Create Histogram
  • De Novo Sequencing
  • Map Reads to Contigs
  • 3D Molecule Viewing: The integrated viewer of structure files, the 3D Molecule Viewer, has been completely redesigned. The Molecule Viewer offers a range of tools for inspection and visualization of proteins and other molecules stored in structure files from the Protein Data Bank (PDB).
  • De novo assembly:
  • New tool: Map Reads to Contigs. This tool allows mapping of reads to contigs. This can be relevant in situations where contigs have been imported from an external source, the output from a de novo assembly is contigs with no read mapping, or if you wish to map a new set of reads or a subset of reads to the contigs.
  • Scaffolds can be exported in AGP format: scaffolded contigs are exported as individual contigs and not as a single scaffold with N's inserted in between contigs. This allows for submission-ready data.
  • Great performance improvement when updating the contig sequence based on reads that are mapped back to contigs.
  • Tracks: Several new features have been added. It is now possible to:
  • When there are more reads than can be shown in the available view area, an overflow graph will be displayed below the reads. The overflow graph was previously shown in grey. Now the overflow graph is shown in the same colors as the sequences. Hence, it is now possible to distinguish forward, reverse and paired reads in the overflow graph as well as mismatches in reads.
  • Insertions from variant tracks and reads tracks can now be shown in tracks.
  • For variant tracks, a new side-panel option “Insertion” allows the user to select whether to display insertions or not.
  • For reads tracks, insertions seen in more than a given percentage of reads are shown. The default percentage is 1%. Setting it to 0% will show all insertions (like the cluster editor) and setting it to 100% will show no insertions.
  • Insertions in reads tracks that are present at a frequency below the specified threshold are shown with a vertical line in the reads to indicate their location.
  • Reads tracks now also have a mouse-over tooltip that provides information about insertions at specific positions. This tooltip reports the number of reads that contain the insertion and lists what was inserted.
  • Extract reads from read tracks in two different ways:
  • Extract sequences from tracks. Allows extraction of all reads as single sequences or as sequence lists.
  • Extract from selection. Allows the creation of a reads track containing only reads that fall within the selected region, and of specific types.
  • Four new options are available in the Side Panel for Track layout when viewing a reads track:
  • Show quality scores: Makes it possible to adjust the colors of the residues based on their quality scores. In cases where no quality scores are available, blue (the color normally used for residues with a low quality score) is used as default color for such residues.
  • Matching residues as dots: Replaces matching residues with dots in reads tracks. This option makes it easier to spot variants.
  • Show read type specific coverage: When enabled, the coverage graph that summarizes those reads that could not be explicitly shown is now replaced by one coverage graph for each read type found in the Reads track. This can be used for easy and visual comparison of the strand specific coverage.
  • Only show coverage graph: When enabled, only the coverage graph is shown and no reads are shown.
  • A new tool has been included: “Identify Graph Threshold Areas”. This tool uses graph tracks as input to identify graph regions that fall within certain limits (thresholds that have been specified by the user).
  • Extract annotations from track. This tool makes it very easy to extract parts of a sequence (or several sequences) based on its annotations.
  • Create a track list with the shortcut Ctrl + L
  • The create histogram tool now also accepts graph tracks as input.
  • The error message "Too much data for rendering. Either zoom in to view data, or adjust data aggregation threshold" has now been added to the big grey box that appears in cases where a track cannot be shown. Previously only a big grey box was shown with no further explanation.
  • Opening a large table view of a variant track is no longer blocking the user interface. It is running in the background, and it is possible to stop loading the data by closing the table view.
  • The Coverage analysis tool is a new tool that can find regions in a read mapping where the coverage is suddenly dropping or rising.
  • The "Assemble Sequences" and "Assemble Sequences to Reference" tools are now batch, server and workflow enabled.
  • Assemble Sequences: Trimming is no longer integrated with the “Assemble Sequences” tool. This means that trimming must be done separately with the “Trim Sequences” tool.
  • Export framework redesigned:
  • Export of multiple files: you can export several files in one go. The naming of the file will default to the name used in the Navigation Area of the Workbench, but the user can specify a naming pattern to use instead.
  • Export formats: A new column “Exports selected” has been added to the “Select exporter” table that indicates with a “Yes”, “No” or “Partly” whether the currently selected element can be exported with the given exporters. Partly means that you have made a selection of elements where only some of them can be exported by the selected exporter.
  • Improved usability with a quick-select dialog for choosing the right export format. The dialog includes a description of each exporter that can be quickly filtered.
  • Export can be integrated into workflows
  • Support for direct compression of exported files in zip and gzip.
  • Previously, VCF export required the user to know that both a variant track and a sequence track should be selected before exporting. This has changed, so that the user only has to select the variant track as input, and the sequence track is supplied as a parameter. This makes it more obvious that a sequence track must be selected, and it also means that the choice of sequence track will be remembered for the next VCF export.
  • The folder view has been improved with the following:
  • It is now possible to drag and drop objects from the folder editor. This will create a copy of the objects at the selected destination.
  • Attribute columns will be left empty if the attribute has not been defined (previously attribute values that had not been defined were set to 0 and checkboxes were shown as unchecked).
  • A new column showing the first 50 residues of each sequence has been added as an option.
  • The column with the name “Length” has been renamed to “Size”.
  • The column “Size” shows the length of a sequence, or for sequence lists, the number of sequences e.g.:
  • Sequence or contig lists: the number of sequences/contigs
  • Read mappings: the length of the consensus sequence
  • De novo assemblies: the length of the reference
  • Alignments: the length of the alignment
  • The Side Panel “Save/Restore Settings” function has been expanded with a new feature:
  • The “Save/Restore Settings” function (found at the top of the Side Panel) has been redesigned. It is now possible to save settings in two different ways. 1) The settings can be saved for the element type in general, e.g. for a track the option would be “For Track View in General”. In this way the settings will be applied each time you open an element of the same type, which in this case means that these settings will be applied each time a saved track is opened from the Navigation Area. These “general” settings are user specific and will not be saved with or exported with the element. 2) Settings can be saved with the specific element only, e.g. for a track the option would be “On This Track Only”. The settings are saved with only this element (and will be exported with the element if you later select to export the element to another destination).
  • Alignments: If you have one particular sequence that you would like to use as a reference sequence, it can be useful to move this to the top. This can now be done automatically by right clicking on the sequence name and then selecting “Move Sequence to Top”.
  • The Sequence List Table has been improved with a new feature. A new column showing the first 50 residues of each sequence has been added as an option.
  • SOLiD import now accepts XSQ files
  • The following Plug-ins are now fully integrated in the Workbench:
  • InDels and Structural Variation (old plugin name: "Structural Variation")
  • Local Realignment
  • Extract Annotations
  • The tomato genome, Solanum lycopersicum SL2.40.18, is now available in the Download Genome tool.
  • Phylogenetic trees:
  • Create Tree now supports the Kimura 2-parameter substitution model for DNA sequences and Kimura's distance estimate for protein sequences (Kimura 1983).
  • It is now possible to construct Maximum Likelihood phylogenies from protein sequences.
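The two Kimura distance estimates mentioned for Create Tree above can be illustrated with the standard textbook formulas. The following is a minimal sketch, not the Workbench's implementation: the function names are hypothetical, and the choice to skip gaps and ambiguity codes is an assumption made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: positions with gaps or ambiguity codes are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    inner = 1 - 2 * p - q
    if inner <= 0 or 1 - 2 * q <= 0:
        return math.inf  # sequences too divergent for the estimate
    return -0.5 * math.log(inner * math.sqrt(1 - 2 * q))


def kimura_protein_distance(p):
    """Kimura's protein distance from the fraction p of differing residues."""
    inner = 1 - p - 0.2 * p * p
    if inner <= 0:
        return math.inf
    return -math.log(inner)
```

For example, `kimura_2p_distance("AAAAAAAAAA", "GAAAAAAAAA")` counts one transition in ten compared positions, and `kimura_protein_distance(0.1)` corrects an observed 10% difference between two protein sequences.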
  • Improvements:
  • Scrolling in read mappings: Scrolling through read mappings with the mouse can now be done faster. Shift + Alt + Mouse wheel scroll makes the scroll 10x as fast as when using Alt + Mouse wheel scroll. When zoomed all the way in, each mouse wheel step scrolls 10 rows.
  • Sort Sequences by Name: The multiplexing tool now allows a delimiter between group names in the “Sort Sequences by Name” wizard. This means that the selected group names are separated by an underscore. Previously all selected group names were merged without any delimiter.
  • Cloning: The cloning editor can now work without having a designated vector. In essence this means that when no vector is selected, the editor goes directly into “Stitch mode” when a fragment has been selected, whereas it goes into “Cloning mode” when a cloning vector and a fragment are selected.
  • Renaming of data in the Navigation Area by clicking twice has been improved. Previously, you could accidentally enter rename mode when the intention was to get focus in the Navigation Area. Now, you can only trigger rename by clicking when the Navigator has focus.
  • Filter Annotations on Name: The wizard layouts for the tool when used directly, as opposed to when included in a workflow, have been standardized.
  • Extract consensus sequence tool:
  • It is now possible to use the quality scores when resolving conflicts or disagreements between reads with “Insert ambiguity codes”. Previously, “Use quality scores” could only be selected when using the “Vote” option for conflict resolution.
  • Low coverage regions are now annotated in the consensus sequence produced.
  • The Extract Consensus Sequence dialog is now shown when extracting the consensus sequence when right-clicking a selection on the reference sequence in the mapping view, enabling the user to extract part of the consensus sequence.
  • The Extract Consensus Sequence dialog is now shown when extracting the consensus sequence when right-clicking the name of the consensus or reference sequence, or when clicking the Extract Consensus button in the mapping table. The right-click menu option on the consensus to Open Sequence Including Gaps has been removed, since this functionality is now covered by the Extract Consensus Sequence tool.
  • When using the “Translate to protein” tool, the max limit has been raised to 1GB.
  • The sequence action "Open Copy" has been removed and "Open This Sequence" has been renamed to "Open Sequence".
  • The alignment tool is now more memory efficient.
  • Tables: Improved auto-adjustment of the column width (based on content and number of columns).
  • Read mapping: The speed of running a read mapping against a masked reference has been improved significantly. When mapping reads to a reference sequence, it is possible to map reads to only selected annotated regions of the reference (= masking). Previously masking of a reference was performed by replacing the masked out nucleotides with N's. The new masking method discards the masked out nucleotides by splitting the reference into separate sequences. Hence, the masked out sequences are completely ignored in the analysis. The remaining sequence fragments are positioned according to the original unmasked reference sequence.
  • Read mapping: The status bar in the lower right corner now shows the corresponding positions on the reference/contig sequence.
  • The read mapper will now place ambiguous gaps to the left, as opposed to the right, to ensure better concordance with common variant databases.
  • BLAST has been upgraded to BLAST+ 2.2.28 that includes a number of improvements and bug fixes. A full list of BLAST+ 2.2.28 changes can be viewed at http://www.ncbi.nlm.nih.gov/books/NBK131777.
  • Usability improvement of simple table filtering:
  • A dedicated filter button has been added to apply the filter directly without having to wait until the filter is automatically applied
  • For tables with more than 10000 rows, the filter will not be applied automatically after a delay. Instead, a help text asks the user to apply the filter through the "Filter" button. This avoids premature filtering before entry of the filter text has completed. Since filtering can take several seconds for large tables, this used to be an annoyance because the user would have to wait until filtering was done to complete the entry.
  • Phylogenetic trees:
  • Bootstrapping with the "Maximum Likelihood Phylogeny" is now possible.
  • Bootstrap values are now displayed in percent instead of absolute numbers.
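The left placement of ambiguous gaps mentioned for the read mapper above can be illustrated with a generic left-alignment step for a deletion in a repeat. This is a minimal sketch under stated assumptions: 0-based coordinates, hypothetical function names, and a general normalization technique rather than the read mapper's actual algorithm.

```python
def left_align_deletion(reference, start, length):
    """Return the leftmost equivalent start of a deletion.

    The deletion removes reference[start:start + length] (0-based).
    It can move one base to the left whenever the base entering the
    deleted window on the left equals the base leaving it on the right.
    """
    while start > 0 and reference[start - 1] == reference[start + length - 1]:
        start -= 1
    return start


def apply_deletion(reference, start, length):
    """Return the sequence that results from applying the deletion."""
    return reference[:start] + reference[start + length:]


# Deleting one "AC" unit from the repeat in GCACACAT is ambiguous:
# several start positions give the same resulting sequence.
ref = "GCACACAT"
leftmost = left_align_deletion(ref, 4, 2)
same_result = apply_deletion(ref, 4, 2) == apply_deletion(ref, leftmost, 2)
```

Reporting every such deletion at its leftmost position means that the same event gets the same coordinates from read to read, which is what makes comparison against variant databases more consistent.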
  • Bug fixes:
  • Numbering of amino acids when calculating amino acid changes was wrong for coding regions spanning the starting point of circular chromosomes. We recommend running amino acid calculation again. Please note that the actual amino acid change is called correctly, only the numbering is affected.
  • PDF export of the history of a result did not include the name and version number of the Workbench that produced the result.
  • Phylogenetic trees:
  • The Jukes-Cantor distance estimate now ignores all positions containing gaps in pairwise alignments.
  • Disabled substitution rate estimation when the corresponding option is deselected by the user in the Maximum Likelihood Phylogeny tool.
  • Fixed a bug that caused branch lengths to be estimated incorrectly for ML trees.
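The corrected gap handling for the Jukes-Cantor distance estimate listed above can be sketched as follows. This is an illustration of the textbook formula with gapped columns excluded, not the Workbench's code; the function name and the `gap` parameter are assumptions for this example.

```python
import math


def jukes_cantor_distance(seq_a, seq_b, gap="-"):
    """Jukes-Cantor distance, ignoring columns where either sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # gapped columns contribute neither matches nor mismatches
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    p = differences / compared
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1 - 4 * p / 3)
```

With this handling, `jukes_cantor_distance("ACGTACGT-A", "ACGTACGAAA")` gives the same value as the same pair with the gapped column removed.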

New in CLC Genomics Workbench 5.5.1 (Sep 12, 2012)

  • Improvements:
  • Improved accuracy of read mapping
  • Bug fixes:
  • Important: In Genomics Workbench 5.5, the Process Tagged Sequences tool would sometimes switch the sample names of the results. We strongly recommend that everybody update to the new version and re-run all analyses made with this tool in Genomics Workbench 5.5.
  • Fixed: Various read mapper bug-fixes that made the read mapper crash on certain data sets
  • Fixed: Workflows would fail when intermediate results were empty (e.g. if no variants were found and a variant track was used for subsequent analysis).
  • Fixed: Consensus generation when creating standard read mappings was slow in Genomics Workbench 5.5
  • Fixed: Some IonTorrent sff files would fail to import on Windows.
  • Various bug fixes

New in CLC Genomics Workbench 5.5 (Aug 17, 2012)

  • New features:
  • Re-sequencing tools:
  • New variant caller: Probabilistic variant detection.
  • This is based on a probabilistic model in contrast to the quality-based variant caller that is based on quality analysis and cut-offs.
  • Supports genomes with a ploidy of 1, 2, 3 or 4.
  • Pre-filtering for non-specific matches and intact pairs
  • Post-filtering of homopolymer regions and forward/reverse reads balance
  • The current SNP and DIP detection tools are merged into one: Quality-based Variant Detection.
  • Pre-filtering for non-specific matches and intact pairs
  • Post-filtering of homopolymer regions and forward/reverse reads balance
  • Target regions statistics (previously a plug-in) is now integrated into the Workbench:
  • A new parameter: Minimum coverage that will report the fraction of each region that is covered by at least this number of reads
  • Works on tracks: the regions of interest are defined in a track and the resulting per-region table is reported as a track
  • Annotation and filtering tools for variants
  • Annotate and filter against database variants (dbSNP, 1000 genomes or other databases that can be downloaded or imported)
  • Filtering of marginal variant calls based on average base quality, forward/reverse reads balance and frequency
  • Annotating variants with exon numbers
  • Variant comparison:
  • Compare variants within group: Find variants that are shared between a number of samples
  • Fisher exact test: Compare variants between case and control groups to find variants that are more common in the case than in the control
  • Trio analysis: Compare child-father-mother variants to enable studies of inherited and de novo mutations
  • Filter against control reads: Compare a variant track against a control sample to remove variants that are also present in the control
  • Filter on haplotype comparison: Identifies variants that have the same haplotype in two samples.
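The case/control comparison listed above under Fisher exact test can be illustrated with a one-sided test on a 2x2 table of sample counts. This is a minimal sketch using only the standard library (Python 3.8+ for `math.comb`); the function name and argument layout are hypothetical and do not describe the Workbench tool's interface.

```python
from math import comb


def fisher_exact_greater(case_with, case_without, control_with, control_without):
    """One-sided Fisher exact p-value that the variant is more common in cases.

    Sums the hypergeometric probabilities of all tables with the same
    margins in which at least `case_with` cases carry the variant.
    """
    n_case = case_with + case_without
    n_with = case_with + control_with
    n_total = n_case + control_with + control_without
    k_max = min(n_case, n_with)
    tail = sum(
        comb(n_with, k) * comb(n_total - n_with, n_case - k)
        for k in range(case_with, k_max + 1)
    )
    return tail / comb(n_total, n_case)


# Hypothetical counts: variant seen in 3 of 4 cases and 1 of 4 controls.
p_value = fisher_exact_greater(3, 1, 1, 3)
```

A small p-value indicates that the variant is seen in cases more often than the fixed margins would predict by chance. With only four samples per group, as in the example, even a 3-versus-1 split is not significant.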
  • Functional consequences of variants:
  • GO enrichment analysis: This tool can be used to investigate the effect of candidate variants by analyzing the affected genes for a common functional role.
  • Amino acid changes: Classify synonymous and non-synonymous variants and see the effect on the protein.
  • Annotate with conservation scores: Annotate a variant with a score from conservation tracks that can be imported into the Workbench.
  • Predict splice site effect: A simple investigation to see if the variant is within two bases of an intron-exon boundary
  • Download of reference genome and annotations
  • Integrated download of reference genome sequences and annotations for selected organisms
  • Example: for human hg19, you can directly download sequences, genes and transcripts, variants from 1000 genomes, Hapmap, COSMIC, and dbSNP (incl. common SNPs).
  • Tracks:
  • Genomic information for re-sequencing analysis can now be stored as tracks.
  • Great power for comparison and visualization because different kinds of data (reads, variants, genes etc) are not bundled into one static file but are separated into one file per data type. This means that different data sources can be compared and visualized in a flexible way.
  • Track lists provide a mechanism for combining several de-coupled tracks into one list for visualization purposes while retaining the individual files that contain the data
  • All tools for re-sequencing have options to create and use tracks (e.g. read mapping, variant detection etc). More tools will be re-designed to work with tracks later.
  • Tools for converting between standard sequences and mappings and tracks:
  • Convert tracks to sequences, mappings etc
  • Convert sequences, mappings and annotations to tracks
  • Tools for filtering, annotating and merging tracks
  • Support for importing files as tracks from a number of new formats:
  • Fasta
  • VCF
  • BED
  • Wiggle
  • UCSC table format
  • GFF / GTF and GVF
  • Complete Genomics master var files
  • Workflow:
  • Workflows can be built in the Workbench to combine various tools from the Toolbox into one analysis, connecting the output from one tool to the input from another
  • Workflows can be distributed and installed either in the Workbench or in the CLC Genomics Server
  • The creator of the workflow can configure parameters for the workflow and these will be fixed when the workflow is distributed and installed
  • The creator of the workflow decides which outputs from the tools should be saved and which should be discarded
  • Workflows can be run in batch, making it a powerful tool for crunching high numbers of samples through the same pipeline.
  • New read mapper:
  • Great improvement of speed for mapping (white paper to be released soon)
  • Support for complex genomes with many repeats
  • Re-design of wizard for read mapping to make it simpler and easier to use. Options to control consensus sequence building and annotating with conflict annotations have been removed, since they have very little relevance for the amounts of data created by NGS platforms today
  • Color space mapping is still performed with the old mapper
  • Automatic calculation of paired distance (only for base space data)
  • Report includes percentage of reads instead of only counts
  • Changed strategy for placement of gaps: previous versions tried to cluster gaps into as few units as possible. This would sometimes cause problems for variant calling because this would in some situations place the gaps differently from read to read.
  • Please note that the memory requirements are different than for the old mapper. The memory requirements depend largely on the size of the reference genome. We will soon update our system requirements page to reflect this.
  • Sequencing QC report: Create summary statistics for sequencing data in various ways:
  • General statistics on read length etc
  • Quality statistics on quality scores
  • Over-representation analysis of subsequences
  • Analysis of duplicated reads
  • Re-organization of menus in general to be more genomics focused:
  • Classical sequencing tools organized into a subfolder (for gene and protein analysis, alignments and trees etc)
  • Molecular biology tools like cloning, PCR primer design, Sanger sequencing analysis etc moved to a special folder
  • Two new folders for core NGS and core track tools
  • Application-specific folders for the various types of NGS applications: resequencing, de novo sequencing, transcriptomics and epigenomics
  • Search tools moved to the Download menu (available from the top menu and the Tool bar)
  • Different importers integrated into one menu, including the new track import. The Vector NTI import has been moved into a plug-in that can easily be installed from the plug-in manager.
  • The Local Search has been moved from the Search menu (now renamed to Download) and into the Edit menu
  • Improvements:
  • It is possible to create sequence lists based on other sequence lists (not only single sequences)
  • Listing folder elements in the Navigation Area is faster
  • Translate to protein creates sequence lists when there are more than ten sequences
  • SAM import and export supports version 1.4
  • Pearson coefficient displayed in scatter plots
  • ChIP-Seq table splits nearest genes into two columns so that it can be used for filtering
  • Illumina import: check-box to control whether reads from Illumina pipeline 1.5-1.7 should be trimmed
  • A new translation table added: Pterobranchia mitochondrial (see http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG24)
  • User manual has been re-structured with a part IV about high-throughput sequencing and integration of import and download sections into part III.
  • Manually updating paired information for sequence list is performed through a dialog, eliminating problems about saving the updates correctly.
  • Find Low Coverage tool in mapping editor is now faster and operates as a background job
  • CLC URL handling in Workbench: In the installer, you can choose to use the Workbench to open files from CLC URLs (clc://). This can be used when working on a CLC Genomics Server, for example, to provide a link to a dataset that can then be opened through the Workbench.
  • Bug fixes:
  • Various bug fixes.
  • Special notes for customers already using the Genomics Gateway plug-in:
  • New search tool in track list editor
  • New navigation and position panel at the top of the Side Panel in the track editor
  • Download tool for downloading genomic data replaces Ensembl download tool
  • Unlimited number of chromosomes in tracks
  • More streamlined conversion tools:
  • Convert tracks to sequences, mappings etc
  • Convert sequences, mappings and annotations to tracks
  • Export tracks to gff, sam
  • Print and graphics export of tracks
  • New tool for filtering marginal variant calls
  • New tool for annotating against database variants
  • Plug-in updates:
  • Genomics Gateway plug-in is integrated into the standard Genomics Workbench and Server.
  • Probabilistic variant detection Plug-in is integrated into the standard Genomics Workbench and Server.
  • Sequencing QC plug-in is integrated into the standard Genomics Workbench and Server.
  • Target regions statistics plug-in is integrated into the standard Genomics Workbench and Server.
  • Grid integration plug-in is integrated into the general server plug-in. If a grid preset is present on the server, the Grid option becomes available in the Workbench dialog.
  • Old read mapper made available as a legacy plug-in that customers can download. This facilitates compatibility of results with previous versions and can be used when memory requirements for the new mapper are too large.
  • Beta read mapper is integrated into the standard Genomics Workbench and Server.
  • Biobase Genome Trax is redesigned and split into two:
  • For downloading data (requires a download license)
  • For annotating a variant track (requires an online license)

New in CLC Genomics Workbench 5.1.5 (Jul 26, 2012)

  • Bug fixes:
  • Fixed: Problem with online BLAST at NCBI

New in CLC Genomics Workbench 5.1 (Jul 26, 2012)

  • Improvements:
  • Ion Torrent paired protocols are now supported for both fastq and sff files.
  • MiSeq multiplexed data directly supported. This means that the barcoded samples are recognized on import and the reads are grouped accordingly. Reads from the same sample are grouped in their own sequence list.
  • New broken pair mate locater tool for getting an overview of where the mates of broken pairs in a selected region are mapped. It includes the possibility to extract a sequence list with the broken pairs.
  • Aligned fasta import and export is now supported (see http://www.bioperl.org/wiki/FASTA_multiple_alignment_format). A consequence of this is that the standard fasta import of sequences will refuse to import sequences that contain gaps, assuming they should be imported as alignments instead.
  • User manual includes a section on which tools will benefit from computers with multiple cores.
  • The license order ID is visible in the License Manager, both for static and network licenses. For security reasons, the last 10 characters of the ID are masked. This will prevent unauthorized persons from copying the license order ID to another computer, but will allow the CLC staff to identify the license used.
  • Bug fixes:
  • Fixed: ChIP-Seq Analysis would sometimes yield no results when the FDR could not be estimated. This error was introduced with Genomics Workbench 5.0.1. If you have had ChIP-Seq samples where no peaks were reported, we recommend re-running the analysis with the new version.
  • Fixed: Cloning bug: when performing restriction cloning in regions with single-stranded DNA, you would get an error.
  • Fixed: 454 paired data import: quality scores on the second part of the read were not imported.
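
The gap check behind the aligned fasta change can be illustrated with a minimal Python sketch. It assumes that "-" is the gap character and that one gapped sequence is enough to treat a file as an alignment. The function names are hypothetical, and the Workbench's actual import logic is not described in this changelog.

```python
def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def contains_gaps(text, gap_chars="-"):
    """Return True if any sequence contains a gap character (assumed '-')."""
    return any(ch in seq for _, seq in parse_fasta(text) for ch in gap_chars)


plain = ">a\nACGT\n>b\nACGA\n"
aligned = ">a\nAC-GT\n>b\nACGGT\n"
print(contains_gaps(plain), contains_gaps(aligned))
```

A file for which contains_gaps returns True would be routed to the alignment importer rather than the plain sequence importer.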

New in CLC Genomics Workbench 5.0.1 (Jul 26, 2012)

  • Plug-in updates:
  • Probabilistic Variant Detection Plug-in updated
  • There is a new filter that requires sequencing reads from both strands to call a variant
  • The forward and reverse coverage for each allele is reported in the output
  • Minor improvements:
  • Small RNA tools: the download and annotation tools now support recent changes to miRBase where mature and mature* nomenclature has been replaced with 3' and 5' mature regions.
  • It is now possible to specify the number formatting in tables in the View Preferences.
  • Bug fixes:
  • Fixed: Downloading of protein sequences from NCBI fails.
  • Fixed: Calculation of cDNA-level changes in variant detection fails in some situations.
  • Fixed: Trimming tool in Sequencing Data Analysis (not for High-throughput Sequencing data) does not add annotation to sequences when the full sequence should be discarded.
  • Fixed: Opening external files (e.g. pdf files or Word documents) with spaces in the file name does not work on Windows.
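
The both-strand filter in the Probabilistic Variant Detection plug-in can be pictured with the following Python sketch. It is an illustration only: the dictionary layout, the function names and the min_reads_per_strand parameter are assumptions made for the example, not the plug-in's actual options.

```python
def passes_strand_filter(forward_count, reverse_count, min_reads_per_strand=1):
    """Return True when the variant allele is seen on both strands."""
    return (forward_count >= min_reads_per_strand
            and reverse_count >= min_reads_per_strand)


def filter_variants(variants, min_reads_per_strand=1):
    """Keep only variants supported by reads from both strands."""
    return [
        v for v in variants
        if passes_strand_filter(v["forward"], v["reverse"], min_reads_per_strand)
    ]


# Hypothetical per-allele forward and reverse coverage values.
candidates = [
    {"position": 101, "allele": "A", "forward": 12, "reverse": 9},
    {"position": 250, "allele": "T", "forward": 15, "reverse": 0},
]
kept = filter_variants(candidates)
print([v["position"] for v in kept])
```

The second candidate is dropped because all of its supporting reads come from the forward strand.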

New in CLC Genomics Workbench 5.0 (Jul 26, 2012)

  • New plug-ins and plug-in updates:
  • Genomics Gateway plug-in updated
  • New tools for analyzing variants in groups of samples, enabling systematic analysis of genetic variants for whole genome, exome or targeted approaches.
  • Find Common Variations in Group. This can be used to find common variants in a group of variant tracks.
  • Fisher Exact Test. Comparing two groups of variant tracks (e.g. can be used for case-control studies). You can see which variants are found more common in the case compared to the control group using the Fisher Exact test.
  • Filter against Control Reads. This can be used to compare a single case variant track against a negative control from the same sample. It will check whether a certain number of the reads in the control sample have the same allele present as in the case variant.
  • New tools for functional annotation of variants
  • GO Enrichment Analysis for identifying significant gene ontology terms, which are annotated to genes having at least one variation.
  • Annotation with Conservation Scores. By importing a conservation score track (e.g. PhyloP Scores), variants can be annotated with a conservation score. Variants with a high score are assumed to alter functionally important regions.
  • New data structure.
  • All tracks are now saved as single files, and you can create a Track List to visualize them together.
  • A tool is available for data conversion from track sets to single tracks
  • New organization of the "Tool box" to provide a better overview
  • Support for batching and running tools on a Genomics Server
  • The Track List view supports drag and drop for adding and re-arranging tracks
  • Several Graph tracks can be created and displayed
  • Read the updated manual here.
  • Probabilistic Variant Detection Plug-in updated
  • The probability used as threshold for the algorithm is now reported in the output
  • Variants are reported with cDNA-level numbering and variant information compatible with www.hgvs.org/
  • Additional Alignments Plug-in updated
  • The algorithms have been updated to the most recent versions
  • The list of algorithms has been reduced to two for compatibility reasons
  • New and improved features:
  • New de novo assembler.
  • Scaffolding is integrated into the assembly. This means better resolution of contigs and insertion of Ns when two contigs cannot be joined in sequence but there is pair information that connects them.
  • New extended report for the assembly with information about nucleotide distribution, contig length measurements and scaffolding regions.
  • User interface improvements: Wizard re-designed to better reflect the process of the assembly. The parameters related to the mapping step are only available when the user chooses to map the reads back to the contigs.
  • New parameter for specifying the maximum bubble size. There is a default value which is automatically calculated based on the input data.
  • New white paper with benchmarks and results from quality control.
  • The old de novo assembler is available as a plug-in. At the end of 2012, the plug-in will be discontinued, so it should only be used for backwards compatibility with results from older runs or if the new assembler fails.
  • Printing and pdf export of read mappings: the mappings are now wrapped to make better use of the paper.
  • SNP and DIP detection results include cDNA-level numbering and variant information compatible with www.hgvs.org/
  • SAM files exported from the Workbench now include basic information about read groups. Furthermore, read orientation for paired reads is now preserved when exporting to SAM and BAM files.
  • Improved exploitation of multi-core machines in read-mapping, RNA-Seq, and de-novo assembly.
  • Improved performance and memory management for high-throughput analyses in general.
  • Usability of the Close icon on tabs has been improved, both in terms of responsiveness and by making it impossible to accidentally initiate a drag and drop movement when you click the Close icon to close a tab.
  • "Show" submenu has been removed from File Menu, and the right-click menu now includes only the relevant views and editors. This provides a better overview.
  • The behavior of the Close Other Tabs function has changed so that it will close all views, regardless of the way the view area is split.
  • The most common annotation types are assigned a special color by default. Other annotation types previously got the same color. This has been extended so that the Workbench attempts to find a special color for each type.
  • VectorNTI import is no longer in a separate plug-in but part of the Workbench. The functionality remains unchanged.
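
The case-control comparison mentioned for the Fisher Exact Test tool can be sketched in Python as a one-sided Fisher exact test on a 2x2 table. This is a generic textbook formulation using only the standard library. Whether the plug-in uses a one-sided or two-sided test, and how it counts samples, is not stated here, so the function name and the table layout are assumptions.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact test p-value for the 2x2 table

                    with variant   without variant
        case             a               b
        control          c               d

    Returns the probability of seeing at least `a` cases with the variant,
    given fixed row and column totals (hypergeometric upper tail).
    """
    cases, controls = a + b, c + d
    with_variant = a + c
    total = comb(cases + controls, with_variant)
    k_max = min(cases, with_variant)
    tail = sum(
        comb(cases, k) * comb(controls, with_variant - k)
        for k in range(a, k_max + 1)
    )
    return tail / total


# 3 of 4 cases and 1 of 4 controls carry the variant: p = 17/70, about 0.243.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(round(p_value, 4))
```

A small p-value suggests the variant is more common in the case group than expected by chance. With many variants tested, some form of multiple-testing correction would normally be applied on top.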

New in CLC Genomics Workbench 4.9 (Jul 26, 2012)

  • New plug-ins and plug-in updates:
  • New plug-in released: Ab Initio Transcript Discovery
  • Brand new tool for transcript discovery. Based on gapped alignments of RNA-Seq data, the plug-in identifies new transcripts and creates or extends annotations on the reference sequence that can be used for measuring gene expression using the RNA-Seq Analysis tool of the Genomics Workbench. The plug-in provides functionality a la Cufflinks/TopHat. Note that this used to be called the Large Gap Mapper plug-in.
  • Genomics Gateway plug-in updated
  • New refiner: variant frequency. This allows you to filter a variation track, so that only the variants that have a frequency above a user-defined threshold remain. Note that the filter only applies to the frequency of non-reference alleles.
  • Performance improvements when visualizing read tracks
  • Fixed: CDS annotations from Ensembl did not include start codons
  • Fixed: Some variation tracks were not always recognized as variations. This means that the variation-specific refiners could not be used.
  • Fixed: Table view of annotation tracks could have a very large number of columns that are now combined into one column.
  • Fixed: There was an error when closing a view without saving changes. This could lead to subsequent errors when trying to rename tracks.
  • MLST module updated
  • Possible to download MLST schemes from any web site compatible with mlstDBnet
  • When a new allele is called because the sequencing reads are not long enough, this is reported in the isolate view rather than "New allele"
  • Structural variation plug-in updated
  • Only detection of insertions, deletions and interchromosomal variations is now supported.
  • The plug-in has a problem with repeats. The best way to work around this is to ignore non-specific matches when doing the mapping, to run the structural variant detection with a very stringent p-value cutoff and filter repeats out afterwards if possible (this could be by refinement with the microsatellite track from Biobase or another repeat track using the Genomics Gateway).
  • Integration of exporter to export results in Circos format.
  • New and improved features:
  • Process tagged sequences
  • A summary report is now available with an overview of the number of reads per bar code.
  • You can search for barcodes (MIDs) on both strands, supporting the new 454 protocol.
  • Core management: you can restrict the maximum number of cores that the Workbench is allowed to use. This can be useful when the Workbench is running on a system with shared resources where other applications need reliable access to CPU when the Workbench is doing analyses. This is mainly an issue for the De novo assembly and Read Mapping algorithms but the restriction applies to all algorithms that use several cores.
  • Multi-site Gateway Cloning. You can perform multi-site gateway cloning and in a few clicks create your expression clones with multiple fragments. The existing Gateway Cloning tool has been expanded so that you can easily recombine several fragments as well as continue using it for the standard Gateway Cloning.
  • Find Binding Sites and Create Fragments improved:
  • If your template sequence contains ambiguity nucleotides (like N, Y etc), these will no longer count as mismatches when checking your primers. Note that the primer base of course needs to be covered by the ambiguity symbol (e.g. a T would still be a mismatch if the template sequence has an R, which means either A or G).
  • Fixed: When using multiple template sequences, the choices to open or annotate a fragment from the fragment table did not work properly. They always applied to the first sequence although the fragment was located on another sequence (as indicated in the table).
  • Exporting fastq format no longer includes the redundant name of the read in the quality score line. Now the name only appears once per read.
  • Enhancing the nomenclature of reporting amino acid changes in variant detection:
  • p. prefix included
  • ? used for unknown (rather than non-standard "Unknown")
  • = used to denote an allele which agrees with the reference sequence (rather than missing entries or entries like Ala45Ala)
  • [...] used around ,-separated lists of changes, each change coming from a different CDS annotation
  • [...];[...] scheme used to separate multiple alleles at same site
  • Bug fixes:
  • Fixed: Import of SOLiD data failed when multiple sets of paired data were selected.
  • Fixed: Annotations spanning the sequence from start to end did not display correctly when the sequence was wrapped. The annotation was only displayed on the first line.
  • Fixed: Set-up experiment would crash when using many samples.
  • Fixed: Calculation of consensus sequence in read mappings: Sometimes a majority of gaps would be ignored and a base erroneously introduced in the consensus sequence. It occurs when 1) there is no coverage in an initial segment of the reference sequence, and 2) a gap is encountered in the global read alignment. From that point onwards, gap counts are included in the consensus vote, but they are taken from the start of the mapping (where they are all 0), so they are out of sync with associated base counts. High gap counts would then kick in further downstream, possibly making the consensus a gap where it should not be. We recommend checking your mapping results manually if you rely on using the consensus sequence for further analysis.
  • Fixed: importing adapters for trimming and barcodes for de-multiplexing did not work properly for CSV files and empty rows in Excel files were not allowed.
  • Fixed: Motif search did not exclude regions with Ns when the option "Exclude matches in N-regions for simple motifs" was selected.
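
The ambiguity rule described for Find Binding Sites and Create Fragments (a primer base matches when it is covered by the template's ambiguity symbol) can be illustrated with the standard IUPAC nucleotide codes. The Python sketch below is a simplification: it compares primer and template position by position in the same orientation, assumes the primer contains only A, C, G and T, and uses hypothetical function names.

```python
# Standard IUPAC nucleotide codes and the bases each symbol covers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def base_matches(primer_base, template_symbol):
    """True if the primer base is covered by the template symbol."""
    return primer_base.upper() in IUPAC[template_symbol.upper()]


def count_mismatches(primer, template):
    """Count positions where the primer base is not covered by the template."""
    if len(primer) != len(template):
        raise ValueError("primer and template must have the same length")
    return sum(not base_matches(p, t) for p, t in zip(primer, template))


print(base_matches("A", "R"))              # A is covered by R (A or G)
print(base_matches("T", "R"))              # T is not covered by R
print(count_mismatches("ACGT", "ANGR"))    # only the final T/R pair mismatches
```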

New in CLC Genomics Workbench 4.8 (Jul 26, 2012)

  • New plug-ins and plug-in updates:
  • New scaffolding de novo assembler released as a beta plug-in.
  • New read mapper released as a beta plug-in. First version without color space support.
  • New probabilistic variant detection released as a beta plug-in.
  • Genomics Gateway beta plug-in updated:
  • Direct download of annotations from Ensembl through the Workbench.
  • Support for importing zipped data
  • Multiple files can be imported in one go
  • Conservation track from UCSC can now be imported
  • Common SNP track from UCSC can now be imported
  • Tools to merge and copy tracks
  • New refiner to extract a subset of genes from a gene track (look for the Name filter refiner)
  • SpliceSite refiner to annotate variations that affect exon/intron boundaries
  • Various bugs fixed
  • Check the updated manual
  • Structural variation beta plug-in released in version 2:
  • Now support for inter-chromosomal structural variations
  • Works with mappings created using the Large Gap Mapper beta plug-in
  • Check the updated manual
  • New and improved features:
  • De novo assembly improvements:
  • Word size can now be manually adjusted
  • When update contigs is not selected, the resulting mapping table will also include contigs where no reads map back. This means that the number of rows in the table will be identical to the number of "Simple contigs" produced by the de novo assembler. Previously contigs with fewer than two reads mapped back would be omitted from the table.
  • Merge Mapping Results will produce a mapping table when mapping tables are provided as input
  • New button to extract a subset of mappings from a mapping table
  • Mapping tables now include a row for reference sequences where no reads map. This is done to provide consistency of results. Opening such an entry in the table will just open the reference sequence in the table.
  • You can switch between compactness levels by pressing the Alt key while scrolling with your mouse or touchpad.
  • SNP detection no longer ignores ambiguity bases in the reads. Each ambiguity code is treated as a separate variant; no merging of the possible variants covered by each ambiguity code is attempted (this typically only has an effect when using Sanger sequencing data since standard NGS platforms do not use ambiguity base calls).
  • Translation in the Side Panel of nucleotide sequences now includes options to translate "All forward" or "All reverse" reading frames.
  • Conflict table view of read mappings: reference positions are now reported in addition to the consensus sequence position.
  • Alignments: it is now possible to copy all annotations from one sequence to other sequences in the alignment.
  • Cloning editor: number of restriction cut sites and motifs are shown separately for the sequence currently displayed and for all sequences in the list.
  • Restriction enzymes updated with latest REBASE version.
  • Clean-up of the Workbench window so that it no longer holds information about which Workspace is active. This information is now displayed with check boxes in the Workspace menu.
  • SAM import and export format is now described in detail in the user manual.
  • Bug fixes:
  • Fixed: Orientation of SOLiD mate-pair data was not set correctly on import. This meant that the reads were marked as broken pairs after mapping. We strongly recommend that all users re-run the import if using SOLiD mate-pair data.
  • Fixed: Virtual tag lists created with RNA failed
  • Fixed: For circular molecules, the Find Open Reading Frames tool did not find reading frames on the negative strand. We recommend that users rerun any reading frame analyses on circular molecules.
  • Fixed: Experiment tables can now be exported in Excel and csv formats
  • Fixed: BLAST searches at NCBI always searched nr or nt, regardless of which database was specified. This has been a problem since the release of CLC Genomics Workbench 4.7
  • Fixed: If a combination of trim options is used, like quality trim or length trim in addition to adapter trimming on both strands, the reads could end up reverse complemented.
  • Fixed: Import of paired data generated by Illumina Casava 1.8 did not match the pairs correctly. Users are advised to re-import and re-analyze all data imported from Casava 1.8.
  • Fixed: Pattern discovery wizard failed when the tool was run a second time.
  • Fixed: De novo assembly sometimes failed on Mac OS 10.7 Lion.
  • Fixed: Errors for read mappings with the text "premature end of .cas file" have now been fixed. This has only been a problem on Windows.
  • Fixed: Certain annotation types were mapped to generic annotation types when exporting sequences in Genbank format.
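
For readers re-checking Casava 1.8 paired data after re-import, the following Python sketch shows one way to pair reads by their FASTQ headers. In Casava 1.8 the part before the space identifies the fragment, and the first field after the space is the read number (1 or 2). How the Workbench matches pairs internally is not described in this changelog, so the function names and the example headers are illustrative only.

```python
def parse_casava18_header(header):
    """Split a Casava 1.8 FASTQ header into (fragment_id, read_number)."""
    header = header.lstrip("@")
    fragment_id, _, description = header.partition(" ")
    read_number = int(description.split(":")[0])
    return fragment_id, read_number


def pair_reads(headers):
    """Group headers by fragment id; return (matched_pairs, unpaired)."""
    by_fragment = {}
    for header in headers:
        fragment_id, read_number = parse_casava18_header(header)
        by_fragment.setdefault(fragment_id, {})[read_number] = header
    matched, unpaired = [], []
    for mates in by_fragment.values():
        if 1 in mates and 2 in mates:
            matched.append((mates[1], mates[2]))
        else:
            unpaired.extend(mates.values())
    return matched, unpaired


# Made-up example headers in Casava 1.8 layout.
headers = [
    "@EAS139:136:FC706VJ:2:2104:15343:197393 1:N:0:ATCACG",
    "@EAS139:136:FC706VJ:2:2104:15343:197393 2:N:0:ATCACG",
    "@EAS139:136:FC706VJ:2:2104:15400:197500 1:N:0:ATCACG",
]
matched, unpaired = pair_reads(headers)
print(len(matched), len(unpaired))
```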

New in CLC Genomics Workbench 4.7.2 (Jul 26, 2012)

  • Bug fixes:
  • Fixed: A cache-related bug which would sometimes result in errors when running large jobs.
  • Fixed: The UniProt search has been updated to reflect URL-changes at uniprot.org.
  • Fixed: A problem with interpretation of broken pairs on re-import from SAM format files.
  • Fixed: A problem with microarray experiments where large experiments could not be analyzed.

New in CLC Genomics Workbench 4.7.1 (Jul 26, 2012)

  • Bug fixes:
  • De novo assembly produced empty results
  • Paired distances for read mapping were not recorded correctly in history
  • Read mapping in batch: the minimum and maximum paired distance fields were enabled even though the "Override" checkbox was unchecked
  • Improved performance of packed view rendering
  • Various minor bug-fixes