The CQLS provides bioinformatics research and analysis consulting. Custom bioinformatics support can take the form of one-on-one training, experimental design discussion, software assistance, custom programming, or a complete analysis from start to finish. The CQLS bioinformatics staff has expertise in a wide range of bioinformatics specializations. Contact us and we will connect you with the right bioinformatician for your research needs.


Annotation, or the prediction of genes that encode proteins (CDS) and other genomic features such as signal peptides, transmembrane regions, ribosomal binding sites, transfer RNAs and conserved motifs/domains, is not easily automated and still needs to be customized based on the organism of interest and the available databases. The bioinformatics group can assist in annotation by setting up and running database searches on the command line against the most common resources (NCBI, Pfam, RefSeq, SMART, PRINTS, COGs, Blast2GO, KEGG, etc.) and collecting the results into user-friendly tools such as SQLite and Microsoft Excel for downstream analysis. Many of these databases are hosted on our local servers for more efficient searching and higher throughput. We can also assist in the development of training sets for gene predictions using GeneMark, GeneMaker, FGENESH, Glimmer, tRNAscan-SE, SNAP and Augustus. For unique or previously undescribed features, we can build custom tools for annotation efforts.
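As an illustration of collecting search results into SQLite, the sketch below loads BLAST tabular output (the standard 12-column `-outfmt 6` layout) into a table and reports the best-scoring hit for each query. The table name `blast_hits`, the function names, and the sample records are hypothetical and chosen only for this example; a real annotation pipeline would likely add further columns and filters.

```python
import csv
import io
import sqlite3

# Hypothetical sample of BLAST tabular output (-outfmt 6, 12 standard columns).
SAMPLE_HITS = (
    "geneA\tsp|P1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380\n"
    "geneA\tsp|P2\t80.0\t150\t30\t0\t1\t150\t5\t154\t1e-30\t120\n"
    "geneB\tsp|P3\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-40\t150\n"
)


def load_blast_hits(handle, conn):
    """Load tab-delimited BLAST hits from a file-like object into SQLite."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS blast_hits (
               qseqid TEXT, sseqid TEXT, pident REAL, length INTEGER,
               mismatch INTEGER, gapopen INTEGER, qstart INTEGER,
               qend INTEGER, sstart INTEGER, send INTEGER,
               evalue REAL, bitscore REAL)"""
    )
    # Skip blank lines and comment lines (as produced by -outfmt 7).
    rows = [
        row
        for row in csv.reader(handle, delimiter="\t")
        if row and not row[0].startswith("#")
    ]
    conn.executemany(
        "INSERT INTO blast_hits VALUES (?,?,?,?,?,?,?,?,?,?,?,?)", rows
    )
    conn.commit()
    return len(rows)


def best_hit_per_query(conn):
    """Return (query, subject, bitscore) for the top-scoring hit per query.

    Relies on SQLite's documented behaviour that bare columns in a MAX()
    aggregate query come from the row holding the maximum.
    """
    return conn.execute(
        "SELECT qseqid, sseqid, MAX(bitscore) FROM blast_hits "
        "GROUP BY qseqid ORDER BY qseqid"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_blast_hits(io.StringIO(SAMPLE_HITS), connection)
    for query, subject, score in best_hit_per_query(connection):
        print(query, subject, score)
```

Once the hits are in a database, the same table can be exported to a spreadsheet or joined against other annotation sources.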


Bioinformatics for non-model species often requires de novo sequencing, either from genomic DNA or transcriptomic cDNA. We provide services and one-on-one guidance to help the community navigate this incredibly broad space, with introductions to genome assemblers like Velvet, SOAPdenovo, CLC-Bio, and others as they are developed. Similarly, we have experience with de novo transcriptome assemblers such as Trinity, Oases, SOAPdenovo-Trans, and Newbler (for 454 and similar data).

Command-Line Unix/Linux

The CGRB infrastructure is heavily based on the Linux operating system and is accessed primarily via remote login on the command line. Computational jobs are run using the Sun Grid Engine (SGE) system. We can help train users to work with these systems through classes, workshops, and one-on-one training. If you need help accessing your files, navigating the file system, installing and running software, modifying permissions, writing quick scripts, or getting programs running, we can help!


The CQLS Bioinformatics staff can assist in the development of whole shotgun sequencing efforts for ecosystem-level inquiries, environmental clone libraries for functional studies, amplicon variant detection for population analysis, and single reference gene-based analysis for community ecology. Our team has experience with specialized metagenomic tools for assembly (MetaVelvet, Meta-IDBA), annotation (Glimmer-MG, FragGeneScan), and taxon assignment (Metawatt, MetaBin, TANGO) that increase the information recovered from metagenomic environmental samples.

Phylogenetic Inference & Population Genetics

Evolution and relationships among genes and organisms can draw from multiple data sources.  Our bioinformaticists can assist in DNA or protein database construction and multiple sequence alignment for phylogenetic inference using distance-based, maximum likelihood and Bayesian modalities.  We can also assist in computational tests of models of evolution and phylogenetic reconstruction, tree topologies, and molecular clock analysis.  Furthermore, we can apply standard or customized statistical tools to population genetic data in order to detect selection pressure, quantitative trait loci, linkage disequilibrium and recombination events. 
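As a small illustration of the distance-based approach mentioned above, the sketch below computes the proportion of differing sites (p-distance) between two aligned DNA sequences and applies the Jukes-Cantor correction, d = -3/4 ln(1 - 4p/3). The function names are hypothetical, and skipping any column that contains a gap or ambiguous base is an assumption of this example rather than a fixed convention.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned DNA sequences.

    Columns where either sequence has a gap or a non-ACGT character are
    skipped (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed p-distance."""
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


if __name__ == "__main__":
    p = p_distance("ACGTACGTAC", "ACGTACGTAA")
    print(p, jukes_cantor(p))
```

A matrix of such pairwise distances is the usual input to distance-based tree-building methods; maximum likelihood and Bayesian analyses instead work from the alignment directly.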


Because research keeps us on the cutting edge and existing software may not meet your analysis needs, writing programs is an integral part of modern bioinformatics. We can teach you how to do it or code it for you. We provide classes and workshops in programming languages such as Python. We can work one-on-one and guide self-directed learning in other languages like Perl, C, and R, and in markup languages like HTML, LaTeX, and Markdown. If you need someone to do it for you, we can write new programs or alter existing software.

Sequence Analysis

One of the most basic needs in bioinformatics is comparative sequence analysis. We provide services and training in a wide variety of topics related to sequence analysis, including database searching (e.g. with BLAST, BLAT, and HMMER), pairwise and multiple alignment (e.g. via Biopython modules and tools like MUSCLE and ClustalW), and reference-guided alignment (e.g. BWA, TopHat/Bowtie).
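To make the idea of pairwise alignment concrete, the sketch below scores a global alignment of two sequences with the Needleman-Wunsch dynamic programming recurrence, using only the Python standard library. The function name and the scoring values (match +1, mismatch -1, linear gap penalty -2) are illustrative assumptions; production tools use substitution matrices and affine gap penalties, and also report the alignment itself rather than only its score.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score (Needleman-Wunsch, linear gaps).

    Only two rows of the dynamic programming table are kept in memory.
    """
    # Row 0: aligning an empty prefix of seq_a against prefixes of seq_b.
    previous = [j * gap for j in range(len(seq_b) + 1)]
    for i in range(1, len(seq_a) + 1):
        # Column 0: aligning a prefix of seq_a against an empty seq_b.
        current = [i * gap]
        for j in range(1, len(seq_b) + 1):
            substitution = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            current.append(
                max(
                    previous[j - 1] + substitution,  # align the two residues
                    previous[j] + gap,               # gap in seq_b
                    current[j - 1] + gap,            # gap in seq_a
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATACA"))
```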

Transcriptomics & RNAseq

High-throughput RNA sequencing (RNA-Seq) provides information on the location, structure and quantity of genes expressed. RNA-Seq data from eukaryotes or prokaryotes can be mapped to a reference genome or used in de novo (without a reference) transcriptome assembly using assemblers such as Trans-ABySS, Trinity and Velvet/Oases. RNA-Seq is also useful for the quantification of alternative splicing, detection of allelic variation and the improvement of genome assembly. Most RNA-Seq data sets are used for differential expression analysis, a powerful tool that allows for the detection of differences in expression levels among various conditions, cell types and developmental stages. Our bioinformaticists work with researchers to identify the appropriate amount of replication and sequencing depth to recover robust statistical information that supports the biological evidence for differences in expression in genes of interest, or across the whole genome.
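As a rough illustration of the comparison that differential expression analysis builds on, the sketch below normalizes raw read counts to counts per million (CPM) and computes a log2 fold change between two samples. The function names, the pseudocount of 1, and the sample counts are assumptions made for this example. It deliberately ignores replication and statistical testing, which is why a real analysis needs replicated samples and dedicated statistical methods.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts per gene to counts per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) on CPM values.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    genes = sorted(set(cpm_control) | set(cpm_treated))
    return {
        gene: math.log2(
            (cpm_treated.get(gene, 0.0) + pseudocount)
            / (cpm_control.get(gene, 0.0) + pseudocount)
        )
        for gene in genes
    }


if __name__ == "__main__":
    # Hypothetical raw counts for two genes in two samples.
    control_counts = {"geneA": 500, "geneB": 500}
    treated_counts = {"geneA": 1500, "geneB": 500}
    print(log2_fold_change(control_counts, treated_counts))
```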


XSEDE is a collection of computing resources that scientists can use to interactively access and share computing resources, data and expertise. It consists of supercomputers, high throughput computing, storage, cloud computing, software and support for scientific computing.

XSEDE is NSF-funded and is available to all Oregon State researchers. Allocations are awarded through a proposal system. A startup allocation can be requested at any time. Based on the performance of the code and the appropriateness of the computations, a proposal can then be submitted for a full research allocation (there are four proposal periods per year). XSEDE also offers specialized computing gateways, including bioscience gateways, that can be used at any time without going through a proposal process.

The CGRB at Oregon State works with users to enable different pathways to access XSEDE resources. We have developed many tools and methods to take full advantage of this valuable resource. To access XSEDE resources, go to  If you need assistance in gaining access to XSEDE resources, contact