Ngila is a global alignment program that can align pairs of sequences using logarithmic and affine gap penalties.
Cartwright RA (2007) Ngila: global pairwise alignments with logarithmic and affine gap costs. Bioinformatics. 23(11):1427-1428
Download and Install
For Windows, there is a binary installer.
Compiling from Source
Download portable source code. Extract the archive and run
cmake . && make
Pay careful attention to any error messages. The most common failure is not finding the right version of Boost. Sometimes enabling static compilation of Boost will work:
rm CMakeCache.txt; cmake -DBoost_USE_STATIC_LIBS=ON . && make
If building is successful, you will find the binary at src/ngila. If you have installation privileges, you can install Ngila globally using
Read the readme.txt file for more information. Ngila's use of CMake is new, so if you are having problems please email for support.
Version 1.3 has been released. This version fixes a few bugs and includes many new features.
- Use CMake for compilation and installation
- New scaling option enabled by default (identical sequences default to cost of 0)
- Protein evolutionary models: aazeta and aageo
- Fasta and Phylip format output support
- Clustal and Phylip format input support
- Report sequence identity measure
- Matrix output formats for distance measures
- Look for "ngilarc" file in the home directory.
- New separator option
- New const-align option
- Replace arg-file option with ngilarc option.
- Use custom zeta function if GSL not found.
- Optimize size of travel table.
- Ordering of --pairs-all fixed
- Bug fix for output of large alignments >10kb
- Minor bug fix for geo model
Ngila has had many improvements (and bug fixes) over the last year, but a new release hasn't been created (yet). If you are doing alignment research, I suggest that you try compiling and using the source code on the current branch.
svn co svn://scit.us/ngila/current
Version 1.2.1 has been released. This version fixes two bugs in the previous release.
- Spaces in Fasta sequence names converted to underscores.
- Sequence names are now properly truncated for output.
Version 1.2 has been released. This version adds two new evolutionary models for specifying costs, while redesigning much of the code to be more modular. Specific changes:
- Changed license to GPL
- Reorganized source code to a more modular framework.
- Compiling from source now requires GSL and Boost Libraries
- Added two evolutionary models: K2P+Zeta and K2P+Geometric
- Command line options changed
- Only FASTA supported for input
- Tie breaking has changed
- The Clustal output format has been altered
- Man page updated and tied to cmds file
- Added pairs setting for more control of sequential alignments
- Several bug fixes
ngila -m zeta -t 0.1 -k 2.0 -r 0.05 -z 1.65 sequences.fasta
Ngila implements the candidate-list method of Miller and Myers (1988) for sequence alignment, with gap costs of the form g(x) = a + b*x + c*ln x. Ngila returns the alignment with the minimum cost and has rules for breaking ties. Its main alignment algorithm is divide-and-conquer, which requires O(M) memory but is slower than a holistic algorithm, which requires O(MN) memory.
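To make the gap cost formula concrete, here is a minimal Python sketch of g(x), assuming x is the gap length in residues. The function name and the parameter values are illustrative assumptions, not Ngila's code or defaults. Setting c = 0 gives a purely affine cost, and setting b = 0 gives a purely logarithmic one.

```python
import math

def gap_cost(length, a, b, c):
    """Cost of a gap of `length` residues: g(x) = a + b*x + c*ln(x)."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return a + b * length + c * math.log(length)

# Illustrative parameter values only; these are not Ngila's defaults.
affine = [gap_cost(x, a=2.0, b=1.0, c=0.0) for x in (1, 2, 4)]
logarithmic = [gap_cost(x, a=2.0, b=0.0, c=1.0) for x in (1, 2, 4)]

print("affine:     ", affine)
print("logarithmic:", logarithmic)
```

Under the affine cost each extra residue adds the same amount, while under the logarithmic cost each doubling of the gap length adds the same amount, so long gaps are penalized less steeply.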
Ngila also implements a secondary, holistic algorithm for alignment, which is faster. The options -M and -N set the size thresholds at or below which the holistic algorithm is used instead of the divide-and-conquer (DnC) algorithm; -M applies to the larger sequence. For example, the command 'ngila -M 5000 -N 5000 seqs.fasta' aligns the sequences in 'seqs.fasta' with the divide-and-conquer algorithm, but switches to the holistic algorithm whenever the subsequences being aligned are no larger than 5000 by 5000.
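The threshold rule can be sketched as a small dispatch function. This is one reading of the description above, not Ngila's source: the function name and return values are hypothetical, and it assumes that -M bounds the longer subsequence and -N the shorter one.

```python
def choose_algorithm(len_a, len_b, max_m, max_n):
    """Pick an alignment strategy for one pair of (sub)sequences.

    max_m and max_n stand in for the -M and -N thresholds; this is
    an assumed interpretation, with -M applied to the longer sequence.
    """
    larger, smaller = max(len_a, len_b), min(len_a, len_b)
    if larger <= max_m and smaller <= max_n:
        return "holistic"            # O(MN) memory, faster
    return "divide-and-conquer"      # O(M) memory, slower

print(choose_algorithm(3000, 5000, 5000, 5000))
print(choose_algorithm(20000, 4000, 5000, 5000))
```

In a divide-and-conquer aligner, a check like this would run on every recursive subproblem, so large inputs start in the low-memory mode and finish in the faster one once the pieces are small enough.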
PICS-Ord is an algorithm to extract phylogenetic information from hard-to-align regions of multiple sequence alignments.