
                                  complex 



Function

   Find the linguistic complexity in nucleotide sequences

Description

Usage

   Here is a sample session with complex


% complex -omnia 
Find the linguistic complexity in nucleotide sequences
Input nucleotide sequence(s): tembl:*
Window length [100]: 
Step size [5]: 
Minimum word length [4]: 
Maximum word length [6]: 
Output file [hs989235.complex]: 
output sequence(s) [hs989235.fasta]: 
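
   The interactive session above can also be expressed as a single
   non-interactive command line. The sketch below only assembles the
   argument list from the qualifiers documented on this page; it does not
   run the program. The helper name build_complex_command is hypothetical,
   and treating -outseq as meaningful only together with -omnia is an
   assumption based on the sample session.

```python
import shlex


def build_complex_command(sequence, outfile, outseq=None, lwin=100, step=5,
                          jmin=4, jmax=6, omnia=False):
    """Assemble an argv list for 'complex' (hypothetical helper).

    Default values and value ranges are taken from the qualifier table
    on this page; nothing is executed here.
    """
    if not 2 <= jmin <= 20:
        raise ValueError("jmin must be an integer from 2 to 20")
    if not 2 <= jmax <= 50:
        raise ValueError("jmax must be an integer from 2 to 50")
    argv = [
        "complex",
        "-sequence", sequence,
        "-lwin", str(lwin),
        "-step", str(step),
        "-jmin", str(jmin),
        "-jmax", str(jmax),
        "-outfile", outfile,
    ]
    if omnia:
        argv.append("-omnia")
        # Assumption: -outseq is only relevant when -omnia is set.
        if outseq is not None:
            argv += ["-outseq", outseq]
    argv.append("-auto")  # general qualifier: turn off prompts
    return argv


if __name__ == "__main__":
    cmd = build_complex_command("tembl:*", "hs989235.complex",
                                outseq="hs989235.fasta", omnia=True)
    # shlex.join quotes 'tembl:*' so the shell does not expand the wildcard.
    print(shlex.join(cmd))
```

   The resulting list can be passed to subprocess.run on a system where
   EMBOSS is installed.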


Command line arguments

   Standard (Mandatory) qualifiers (* if not always prompted):
  [-sequence]          seqall     Nucleotide sequence(s) filename and optional
                                  format, or reference (input USA)
   -lwin               integer    [100] Window length (Any integer value)
   -step               integer    [5] Displacement of the window over the
                                  sequence (Any integer value)
   -jmin               integer    [4] Minimum word length (Integer from 2 to
                                  20)
   -jmax               integer    [6] Maximum word length (Integer from 2 to
                                  50)
  [-outfile]           outfile    [*.complex] Output file name
*  -outseq             seqoutall  [.] Sequence set(s)
                                  filename and optional format (output USA)

   Additional (Optional) qualifiers: (none)
   Advanced (Unprompted) qualifiers:
   -omnia              toggle     [N] Calculate over a set of sequences
   -sim                integer    [0] Calculate the linguistic complexity by
                                  comparison with a number of simulations
                                  having a uniform distribution of bases (Any
                                  integer value)
   -freq               boolean    [N] Execute the simulation of a sequence
                                  based on the base frequency of the original
                                  sequence
   -print              boolean    [N] Generate a file named UjTable containing
                                  the values of Uj for each word j in the
                                  real sequence(s) and in any simulated
                                  sequences
   -ujtablefile        outfile    [complex.ujtable] Program complex temporary
                                  output file
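
   The qualifiers above (-lwin, -step, -jmin, -jmax, and the Uj values
   mentioned under -print) suggest a sliding-window calculation over words
   of several lengths. The following is a minimal sketch of one common
   definition of linguistic complexity, in which Uj is the number of
   distinct words of length j seen in a window divided by the maximum
   number possible, and the window value is the product of Uj over all j.
   Whether complex uses exactly this formula, how it combines the windows,
   and how it treats ambiguity codes such as 'n' are assumptions here; the
   function names are illustrative and are not part of EMBOSS.

```python
def word_usage(seq, j, alphabet_size=4):
    """Return U_j: distinct length-j words observed / maximum possible."""
    n_positions = len(seq) - j + 1
    if n_positions <= 0:
        return 0.0
    observed = {seq[i:i + j] for i in range(n_positions)}
    # A window cannot hold more distinct words than it has positions,
    # nor more than the alphabet allows.
    max_possible = min(alphabet_size ** j, n_positions)
    return len(observed) / max_possible


def window_complexity(window, jmin=4, jmax=6):
    """Product of U_j for j = jmin..jmax (assumed definition)."""
    value = 1.0
    for j in range(jmin, jmax + 1):
        value *= word_usage(window, j)
    return value


def sequence_complexity(seq, lwin=100, step=5, jmin=4, jmax=6):
    """Mean window complexity along the sequence (averaging is assumed)."""
    seq = seq.lower()
    if len(seq) <= lwin:
        return window_complexity(seq, jmin, jmax)
    starts = range(0, len(seq) - lwin + 1, step)
    values = [window_complexity(seq[s:s + lwin], jmin, jmax) for s in starts]
    return sum(values) / len(values)
```

   Under this definition a highly repetitive sequence scores close to 0
   and a sequence with a rich vocabulary scores closer to 1. The values
   are not guaranteed to match those reported by the program.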

   Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1            integer    Start of each sequence to be used
   -send1              integer    End of each sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-outfile" associated qualifiers
   -odirectory2        string     Output directory

   "-ujtablefile" associated qualifiers
   -odirectory         string     Output directory

   "-outseq" associated qualifiers
   -osformat           string     Output seq format
   -osextension        string     File name extension
   -osname             string     Base file name
   -osdirectory        string     Output directory
   -osdbname           string     Database name to add
   -ossingle           boolean    Separate file for each entry
   -oufo               string     UFO features
   -offormat           string     Features format
   -ofname             string     Features file name
   -ofdirectory        string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages

Input file format

  Input files for usage example

   'tembl:*' matches every sequence entry in the example nucleic acid
   database 'tembl'

Output file format

   Sequence TEMBL:HHTETRA contains repeats and is included in the test
   database for repeat analysis. It has the lowest complexity value
   (0.3114) in the example output.

  Output files for usage example

  File: complex.ujtable

  File: hs989235.complex

Length of window : 100
jmin : 4
jmax : 6
step : 5
Execution without simulation
----------------------------------------------------------------------------
|                  |                  |                  |                  |
|     number of    |      name of     |     length of    |      value of    |
|     sequence     |     sequence     |     sequence     |     complexity   |
|                  |                  |                  |                  |
----------------------------------------------------------------------------
         1                    HS989235            495             0.7210
         2                    AB009602            561             0.6688
         3                      HSCAD5           3170             0.6921
         4                         HSD            781             0.6991
         5                      HSEGL1           3919             0.6618
         6                       HSFAU            518             0.6739
         7                      HSFAU1           2016             0.7105
         8                       HSFOS           6210             0.6681
         9                       HSEF2           3075             0.6925
        10                        HSHT           1658             0.7314
        11                       HSTS1          18596             0.6668
        12                      HSNFG9          33760             0.6661
        13                    AB000095           2399             0.6569
        14                    AB009062            532             0.6465
        15                     HSFERG1            512             0.5609
        16                     HSFERG2           1132             0.7217
        17                    AC004629         116019             0.6478
        18                    AP000504         100000             0.6611
        19                    AF129756         184666             0.6562
        20                    AB000360           2582             0.6710
        21                       HSHBB          73308             0.6544
        22                     CEZK637          40699             0.6307
        23                      PDRHOD           1675             0.6201
        24                    AAHSP70B           4712             0.7110
        25                      GMGL01           3400             0.3981
        26                    GMLLBPS1            852             0.5986
        27                    GMLLBPS2           1698             0.5225
        28                       ECLAC           7477             0.7137
        29                      ECLACA           1832             0.6916
        30                      ECLACI           1113             0.7480
        31                      ECLACY           1500             0.6801
        32                      ECLACZ           3078             0.7278
        33                      PAAMIB           1212             0.6596
        34                      PAAMIE           1065             0.6418
        35                      PAAMIR           2167             0.6562
        36                      PAAMIS           1130             0.6989
        37                        MMAM            366             0.7163
        38                       RNOPS           1493             0.6571
        39                    RNU68037           1218             0.6381
        40                   HSA203YC1            389             0.4327
        41                     HHTETRA           1272             0.3114
        42                    XLRHODOP           1684             0.7193
        43                     XL23808           4734             0.7180
        44                    AF123456           1510             0.5913
        45                    AF123457           1634             0.7254
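
   The report above is plain text with four whitespace-separated columns
   per data row, so it can be read back with a few lines of code. The
   sketch below assumes the layout shown in this example; the function
   name parse_complex_table is hypothetical, and the embedded sample holds
   only a few rows copied from the listing above.

```python
def parse_complex_table(text):
    """Return (number, name, length, complexity) tuples from a report.

    Header, ruler and parameter lines are skipped because they do not
    split into exactly four fields of the expected types.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        try:
            rows.append((int(fields[0]), fields[1],
                         int(fields[2]), float(fields[3])))
        except ValueError:
            continue  # not a data row
    return rows


SAMPLE = """\
Length of window : 100
jmin : 4
Execution without simulation
----------------------------------------------------------------------------
|     number of    |      name of     |     length of    |      value of    |
----------------------------------------------------------------------------
         1                    HS989235            495             0.7210
        25                      GMGL01           3400             0.3981
        41                     HHTETRA           1272             0.3114
"""

if __name__ == "__main__":
    rows = parse_complex_table(SAMPLE)
    # Sort ascending so the most repetitive sequences come first.
    for number, name, length, value in sorted(rows, key=lambda r: r[3]):
        print(name, length, value)
```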

  File: hs989235.fasta

>HS989235 H45989.1 yo13c02.s1 Soares adult brain N2b5HB55Y Homo sapiens cDNA clone IMAGE:177794 3', mRNA sequence.
ccggnaagctcancttggaccaccgactctcgantgnntcgccgcgggagccggntggan
aacctgagcgggactggnagaaggagcagagggaggcagcacccggcgtgacggnagtgt
gtggggcactcaggccttccgcagtgtcatctgccacacggaaggcacggccacgggcag
gggggtctatgatcttctgcatgcccagctggcatggccccacgtagagtggnntggcgt
ctcggtgctggtcagcgacacgttgtcctggctgggcaggtccagctcccggaggacctg
gggcttcagcttcccgtagcgctggctgcagtgacggatgctcttgcgctgccatttctg
ggtgctgtcactgtccttgctcactccaaaccagttcggcggtccccctgcggatggtct
gtgttgatggacgtttgggctttgcagcaccggccgccgagttcatggtngggtnaagag
atttgggttttttcn
>AB009602 AB009602.1 Schizosaccharomyces pombe mRNA for MET1 homolog, partial cds.
gttcgatgcctaaaataccttcttttgtccctacacagaccacagttttcctaatggctt
tacaccgactagaaattcttgtgcaagcactaattgaaagcggttggcctagagtgttac
cggtttgtatagctgagcgcgtctcttgccctgatcaaaggttcattttctctactttgg
aagacgttgtggaagaatacaacaagtacgagtctctcccccctggtttgctgattactg
gatacagttgtaatacccttcgcaacaccgcgtaactatctatatgaattattttccctt
tattatatgtagtaggttcgtctttaatcttcctttagcaagtcttttactgttttcgac
ctcaatgttcatgttcttaggttgttttggataatatgcggtcagtttaatcttcgttgt
ttcttcttaaaatatttattcatggtttaatttttggtttgtacttgttcaggggccagt
tcattatttactctgtttgtatacagcagttcttttatttttagtatgattttaatttaa
aacaattctaatggtcaaaaa
>HSCAD5 X59796.1 H.sapiens mRNA for cadherin-5
ctccactcacgctcagccctggacggacaggcagtccaacggaacagaaacatccctcag
cccacaggcacgatctgttcctcctgggaagatgcagaggctcatgatgctcctcgccac
atcgggcgcctgcctgggcctgctggcagtggcagcagtggcagcagcaggtgctaaccc
tgcccaacgggacacccacagcctgctgcccacccaccggcgccaaaagagagattggat
ttggaaccagatgcacattgatgaagagaaaaacacctcacttccccatcatgtaggcaa
gatcaagtcaagcgtgagtcgcaagaatgccaagtacctgctcaaaggagaatatgtggg
caaggtcttccgggtcgatgcagagacaggagacgtgttcgccattgagaggctggaccg
ggagaatatctcagagtaccacctcactgctgtcattgtggacaaggacactggcgaaaa
cctggagactccttccagcttcaccatcaaagttcatgacgtgaacgacaactggcctgt
gttcacgcatcggttgttcaatgcgtccgtgcctgagtcgtcggctgtggggacctcagt
catctctgtgacagcagtggatgcagacgaccccactgtgggagaccacgcctctgtcat
gtaccaaatcctgaaggggaaagagtattttgccatcgataattctggacgtattatcac
aataacgaaaagcttggaccgagagaagcaggccaggtatgagatcgtggtggaagcgcg
agatgcccagggcctccggggggactcgggcacggccaccgtgctggtcactctgcaaga
catcaatgacaacttccccttcttcacccagaccaagtacacatttgtcgtgcctgaaga
cacccgtgtgggcacctctgtgggctctctgtttgttgaggacccagatgagccccagaa
ccggatgaccaagtacagcatcttgcggggcgactaccaggacgctttcaccattgagac
aaaccccgcccacaacgagggcatcatcaagcccatgaagcctctggattatgaatacat
ccagcaatacagcttcatagtcgaggccacagaccccaccatcgacctccgatacatgag
ccctcccgcgggaaacagagcccaggtcattatcaacatcacagatgtggacgagccccc
cattttccagcagcctttctaccacttccagctgaaggaaaaccagaagaagcctctgat
tggcacagtgctggccatggaccctgatgcggctaggcatagcattggatactccatccg
caggaccagtgacaagggccagttcttccgagtcacaaaaaagggggacatttacaatga
gaaagaactggacagagaagtctacccctggtataacctgactgtggaggccaaagaact
ggattccactggaacccccacaggaaaagaatccattgtgcaagtccacattgaagtttt
ggatgagaatgacaatgccccggagtttgccaagccctaccagcccaaagtgtgtgagaa
cgctgtccatggccagctggtcctgcagatctccgcaatagacaaggacataacaccacg
aaacgtgaagttcaaattcatcttgaatactgagaacaactttaccctcacggataatca


  [Part of this file has been deleted for brevity]

gagccagttgtcaagaagagcagcagcagcagctcctgtctcctgcaggacagcagcagc
cctgctcactccacgagcacggtggcagcagcagcagcgagcgcaccaccagagggacgg
atgctcattcaggacatcccttccatccccagcagagggcacttggagagcacgtctgat
ttggttgtggactccacctactacagcagtttttaccagccatccctgtatccttactat
aacaacctgtacaactactcccagtaccaaatggcagtggccactgagtcttcctcaagt
gagacagggggtacgtttgtagggtcagccatgaaaaacagccttcgaagcctcccagca
acatacatgtcaagccagtcaggaaaacagtggcagatgaagggaatggagaaccgccat
gccatgagctcccagtaccggatgtgctcctactacccgcccacctcatacctgggccag
ggggttggcagtcccacctgcgtcacacagatactggcctcggaggacaccccctcctac
tcagagtcgaaagcgagagtgttttcgccgcccagcagccaggactcgggcctggggtgc
ctgtcgagcagcgagagcaccaagggagacctggagtgcgagccccaccaagagcccggc
gccttcgcggtgagcccggttcttgagggcgagtaggcgcggcgtcgggcggctgctgcg
cggcgttcactgttgccttgttctgttggggttgcgggggggcgttgggtttcttctttc
cggggcggggggggcacggcggggccgcggccgggccggcggggcggggcggggcgggac
ggggcggggcggagccgcgcgggggccgcagtccgggccggggccgccgtcgggtctcgg
cccgctcccgtcggggcggagcgtccgacgatcggcctccacgaaacgcggtgccgtgat
gtgtttgtagtggttcctcgtaggctccagacgttttctcctcgtatcgccaaattaacg
cgttttgcatattacagttgagtgcctcgacttagattgcaatataagcggccagcaaac
aagtctcaaaaaaaagttacgtgcgtttctgcgagtgttattttgttaagaacggctcac
agtgtcctcttcctgtgttacagaagccaacctgaaatgaaactagtctggaaaaattca
ttgttctctgtagttgcagctgtacctgaaataaaaatgttattgatgactgaaaaaaaa
aaaaaaaaaa
>AF123457 AF123457.1 Toxoplasma gondii enolase (ENO2) mRNA, complete cds.
tctaccgttactcaacttccaacaaaatggtggccatcaaggacatcactgctcgtcaga
tcctcgactcccgaggaaacccgaccgtcgaggttgacttgttgaccgatggcggctgct
tccgtgccgctgtccccagcggcgcatccactggcatctacgaggcgcttgagctccgtg
acaaggaccaaactaagttcatgggcaagggtgtgatgaaggccgtggagaacatccaca
agattatcaagccggcgcttattggcaaggacccgtgcgaccagaagggtattgacaagc
tgatggtcgaggagctcgatggaactaagaacgagtggggctggtgcaagtcgaagctcg
gcgcgaacgcgatcctggccgtctcgatggcttgctgccgcgccggcgctgctgccaagg
gcatgcccctgtacaagtacattgccactttggctggaaacccgacagacaagatggtaa
tgcccgtcccgttcttcaacgtcatcaacggcggctcccacgcaggcaacaaggtcgcga
tgcaggagttcatgatcgcccccgtcggcgcctccacaatccaagaggcgatccagatcg
gcgcggaagtgtaccagcacctgaaggtcgtcattaagaagaagtatggcctcgacgcca
cgaacgtcggcgacgagggtggcttcgcccccaacatcagcggcgccacggaggccctcg
acttgctgatggaggccatcaaggtgtctggtcacgaaggcaaggtcaagattgccgccg
acgtcgccgcttccgagttcttcctccaggacgacaaagtctatgacctagacttcaaga
ctccgaacaacgacaagtcgcaacgcaagactggcgaagagcttcgcaacctgtacaagg
acctgtgccagaagtatcccttcgtgtccatcgaggacccgttcgaccaggacgacttcc
acagctacgctcagctcaccaacgaggttggcgagaaggtccaaatcgtcggcgacgacc
tcctggtcaccaacccgacgcgcattgagaaggccgttcaggagaaagcgtgcaacggcc
tgcttctcaaggtgaaccagattggcacagtcagcgagtctatcgaggcctgccagcttg
cccagaagaacaagtggggcgttatggtttctcaccgctccggtgagactgaggactcct
tcatcgctgacctcgtcgtcggtctccgcaccgggcaaatcaagactggcgccccgtgca
gatccgagcgtctctgcaagtacaaccagctgatgcgtatcgaagagtcgctcggctccg
actgtcagtacgccggcgctggcttccgccatcccaactaagtggaaacggagtttcgac
tacccaactgctcaattggggctgggtggtttgtccactctgcaacaagggcgtgacgag
atcgttgcacatgcaactgccttttttgtgcttggtgggaaggagcactttcgcaggtgc
agcaccgagttgcggttgatgggaatttcggaactgatttgtttcttgcatgccatcacc
gaaggaacgagcagtttcgttgataatattggaaagtcttttgaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaa
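
   The sequence lengths in the report can be cross-checked against the
   FASTA output file. The sketch below counts residues per entry; it
   assumes that each header occupies a single line starting with '>' and
   that the first word of the header is the entry name. The function name
   read_fasta_lengths is hypothetical.

```python
def read_fasta_lengths(lines):
    """Map each entry name to its sequence length.

    'lines' is any iterable of text lines, for example an open file.
    """
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # First whitespace-delimited word after '>' is taken as the name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


if __name__ == "__main__":
    example = [
        ">S1 first illustrative entry",
        "acgtacgtac",
        "gtn",
        ">S2 second illustrative entry",
        "ttttt",
    ]
    print(read_fasta_lengths(example))
```

   With a real file, pass the open file object, for example
   read_fasta_lengths(open("hs989235.fasta")).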

Data files

   None

Notes

   None

References

   None

Warnings

   None

Diagnostic Error Messages

   None

Exit status

   Always exits with status 0

Known bugs

   None

See also

   Program name                        Description
   banana       Bending and curvature plot in B-DNA
   btwisted     Calculates the twisting in a B-DNA sequence
   chaos        Create a chaos game representation plot for a sequence
   compseq      Count composition of dimer/trimer/etc words in a sequence
   dan          Calculates nucleic acid melting temperature
   freak        Residue/base frequency table or plot
   isochore     Plots isochores in large DNA sequences
   sirna        Finds siRNA duplexes in mRNA
   wordcount    Counts words of a specified size in a DNA sequence

Author(s)

Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.

Comments

   None
