| SEQWORDS documentation | 
| 
TY   SCOP
XX
CL   Alpha and beta proteins (a/b)
XX
FO   NAD(P)-binding Rossmann-fold domains
XX
SF   NAD(P)-binding Rossmann-fold domains
XX
FA   Lactate & malate dehydrogenases, N-terminal domain
XX
TE   NAD(P)-binding Rossmann-fold
TE   Lactate & malate dehydrogenases
TE   Lactate dehydrogenase
TE   Malate dehydrogenase
//
 | 
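The keywords record shown above uses two-letter line codes (TY, CL, FO, SF, FA, TE), `XX` spacer lines and a `//` terminator. As a minimal sketch, assuming one code per line and that only TE lines repeat within a record, such a file could be read as follows. The function name `parse_keywords` and the dictionary layout are illustrative choices, not part of SEQWORDS.

```python
EXAMPLE = """\
TY   SCOP
XX
CL   Alpha and beta proteins (a/b)
XX
FO   NAD(P)-binding Rossmann-fold domains
XX
SF   NAD(P)-binding Rossmann-fold domains
XX
FA   Lactate & malate dehydrogenases, N-terminal domain
XX
TE   NAD(P)-binding Rossmann-fold
TE   Lactate & malate dehydrogenases
TE   Lactate dehydrogenase
TE   Malate dehydrogenase
//
"""


def parse_keywords(text: str) -> list[dict]:
    """Split a keywords file into records (hypothetical helper).

    Each record maps a line code to its value; TE (search term) lines
    are collected into a list because they repeat.
    """
    records = []
    current = {"TE": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "XX":      # blank or spacer line
            continue
        if line == "//":                  # end of record
            records.append(current)
            current = {"TE": []}
            continue
        code, _, value = line.partition(" ")
        value = value.strip()
        if code == "TE":
            current["TE"].append(value)
        else:
            current[code] = value
    return records


records = parse_keywords(EXAMPLE)
print(records[0]["FA"], records[0]["TE"])
```

Keeping the TE terms as a list preserves their order, which makes it straightforward to try them one by one against a database entry.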
| 
ID   ACEA_ECOLI     STANDARD;      PRT;   434 AA.
AC   P05313;
DT   01-NOV-1988 (Rel. 09, Created)
DT   01-NOV-1988 (Rel. 09, Last sequence update)
DT   15-DEC-1998 (Rel. 37, Last annotation update)
DE   ISOCITRATE LYASE (EC 4.1.3.1) (ISOCITRASE) (ISOCITRATASE) (ICL).
GN   ACEA OR ICL.
OS   Escherichia coli.
OC   Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
OC   Escherichia.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=K12;
RX   MEDLINE; 89083515.
RA   Byrne C.R., Stokes H.W., Ward K.A.;
RT   "Nucleotide sequence of the aceB gene encoding malate synthase A in
RT   Escherichia coli.";
RL   Nucleic Acids Res. 16:10924-10924(1988).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   STRAIN=K12;
RX   MEDLINE; 88262573.
RA   Rieul C., Bleicher F., Duclos B., Cortay J.-C., Cozzone A.J.;
RT   "Nucleotide sequence of the aceA gene coding for isocitrate lyase in
RT   Escherichia coli.";
RL   Nucleic Acids Res. 16:5689-5689(1988).
RN   [3]
RP   SEQUENCE FROM N.A.
RX   MEDLINE; 89008064.
RA   Matsuoka M., McFadden B.A.;
RT   "Isolation, hyperexpression, and sequencing of the aceA gene encoding
RT   isocitrate lyase in Escherichia coli.";
RL   J. Bacteriol. 170:4528-4536(1988).
RN   [4]
RP   SEQUENCE FROM N.A.
RC   STRAIN=K12 / MG1655;
RX   MEDLINE; 94089392.
RA   Blattner F.R., Burland V.D., Plunkett G. III, Sofia H.J.,
RA   Daniels D.L.;
RT   "Analysis of the Escherichia coli genome. IV. DNA sequence of the
RT   region from 89.2 to 92.8 minutes.";
RL   Nucleic Acids Res. 21:5408-5417(1993).
RN   [5]
RP   SEQUENCE OF 293-434 FROM N.A.
RX   MEDLINE; 88227861.
RA   Klumpp D.J., Plank D.W., Bowdin L.J., Stueland C.S., Chung T.,
RA   Laporte D.C.;
RT   "Nucleotide sequence of aceK, the gene encoding isocitrate
RT   dehydrogenase kinase/phosphatase.";
RL   J. Bacteriol. 170:2763-2769(1988).
  [Part of this file has been deleted for brevity]
FT   CONFLICT     70     70       A -> R (IN REF. 2).
FT   CONFLICT     80     80       A -> R (IN REF. 1 AND 2).
FT   CONFLICT    116    116       I -> N (IN REF. 2).
FT   CONFLICT    144    144       F -> L (IN REF. 1).
FT   CONFLICT    305    312       LGEEFVNK -> WAKSSLISN (IN REF. 2).
FT   CONFLICT    307    307       E -> Q (IN REF. 1).
FT   STRAND        2      6
FT   TURN          7      9
FT   HELIX        11     23
FT   TURN         26     27
FT   STRAND       28     33
FT   TURN         37     38
FT   HELIX        39     47
FT   TURN         48     48
FT   STRAND       53     58
FT   HELIX        64     67
FT   TURN         68     69
FT   STRAND       72     75
FT   TURN         83     84
FT   HELIX        87    108
FT   TURN        110    111
FT   STRAND      113    116
FT   HELIX       121    134
FT   TURN        135    136
FT   TURN        140    141
FT   STRAND      143    145
FT   HELIX       148    162
FT   TURN        163    163
FT   HELIX       166    168
FT   STRAND      173    175
FT   TURN        179    181
FT   STRAND      182    184
FT   HELIX       186    188
FT   TURN        190    191
FT   HELIX       196    217
FT   TURN        218    219
FT   HELIX       225    242
FT   TURN        243    244
FT   STRAND      248    255
FT   STRAND      263    271
FT   TURN        272    273
FT   STRAND      274    278
FT   HELIX       286    311
SQ   SEQUENCE   312 AA;  32337 MW;  17741A3B5AD068BA CRC64;
     MKVAVLGAAG GIGQALALLL KTQLPSGSEL SLYDIAPVTP GVAVDLSHIP TAVKIKGFSG
     EDATPALEGA DVVLISAGVA RKPGMDRSDL FNVNAGIVKN LVQQVAKTCP KACIGIITNP
     VNTTVAIAAE VLKKAGVYDK NKLFGVTTLD IIRSNTFVAE LKGKQPGEVE VPVIGGHSGV
     TILPLLSQVP GVSFTEQEVA DLTKRIQNAG TEVVEAKAGG GSATLSMGQA AARFGLSLVR
     ALQGEQGVVE CAYVEGDGQY ARFFSQPLLL GKNGVEERKS IGTLSAFEQN ALEGMLDTLK
     KDIALGEEFV NK
//
 | 
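The Swiss-Prot-format entry above is line-oriented: a two-letter code, a gap, then the value, with the sequence following the SQ line and `//` closing the entry. The sketch below pulls out only the ID, AC and DE lines and the sequence, under the assumption that values start at column 6 as in the listing. `parse_swissprot_entry` and the toy entry are hypothetical and exist only for illustration.

```python
# Hypothetical toy entry, laid out like the listing above.
TOY_ENTRY = """\
ID   TOY_ECOLI      STANDARD;      PRT;    10 AA.
AC   X00000;
DE   HYPOTHETICAL EXAMPLE PROTEIN
DE   (TOY).
SQ   SEQUENCE   10 AA;
     MKVAVLGAAG
//
"""


def parse_swissprot_entry(text: str) -> dict:
    """Extract ID, accessions, description and sequence from one entry."""
    entry_id = None
    accessions = []
    description = []
    residues = []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):         # entry terminator
            break
        if in_sequence:                   # sequence lines: drop the spacing
            residues.append("".join(line.split()))
            continue
        code, body = line[:2], line[5:].strip()
        if code == "ID":
            entry_id = body.split()[0]
        elif code == "AC":
            accessions.extend(a.strip() for a in body.split(";") if a.strip())
        elif code == "DE":
            description.append(body)
        elif code == "SQ":
            in_sequence = True
    return {
        "id": entry_id,
        "accessions": accessions,
        "description": " ".join(description),
        "sequence": "".join(residues),
    }


print(parse_swissprot_entry(TOY_ENTRY))
```

Lines with other codes (DT, RN, FT and so on) are ignored here; a real parser would need to handle them, as well as files containing many entries.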
| 
> Q60150^.^1^312^SCOP^.^0^Alpha and beta proteins (a/b)^.^.^NAD(P)-binding Rossmann-fold domains^NAD(P)-binding Rossmann-fold domains^Lactate & malate dehydrogenases, N-terminal domain^KEYWORD^0.00^0.000e+00^0.000e+00
MKVAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGEDATPALEGADVVLISAGVARKPGMDRSDLFNVNAGIVKNLVQQVAKTCPKACIGIITNPVNTTVAIAAEVLKKAGVYDKNKLFGVTTLDIIRSNTFVAELKGKQPGEVEVPVIGGHSGVTILPLLSQVPGVSFTEQEVADLTKRIQNAGTEVVEAKAGGGSATLSMGQAAARFGLSLVRALQGEQGVVECAYVEGDGQYARFFSQPLLLGKNGVEERKSIGTLSAFEQNALEGMLDTLKKDIALGEEFVNK
 | 
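The DHF record above is FASTA-like: a header whose fields are separated by `^`, followed by the sequence. The sketch below splits such a header and names only the fields that can be matched against the example itself (accession, start and end positions, database type, and the class, fold, superfamily and family strings that also appear in the keywords file). These field names are assumptions inferred from this one record, not taken from a format specification; the remaining fields are kept unnamed under `raw`.

```python
def parse_dhf_header(header: str) -> dict:
    """Split a DHF header line into fields (names inferred, see note above)."""
    fields = header.lstrip("> ").split("^")
    return {
        "accession": fields[0],
        "start": int(fields[2]),
        "end": int(fields[3]),
        "type": fields[4],          # e.g. SCOP
        "class": fields[7],
        "fold": fields[10],
        "superfamily": fields[11],
        "family": fields[12],
        "method": fields[13],       # KEYWORD in the example record
        "raw": fields,              # all fields, including unnamed ones
    }


HEADER = (
    "> Q60150^.^1^312^SCOP^.^0^Alpha and beta proteins (a/b)^.^."
    "^NAD(P)-binding Rossmann-fold domains"
    "^NAD(P)-binding Rossmann-fold domains"
    "^Lactate & malate dehydrogenases, N-terminal domain"
    "^KEYWORD^0.00^0.000e+00^0.000e+00"
)

hit = parse_dhf_header(HEADER)
print(hit["accession"], hit["start"], hit["end"], hit["family"])
```

In the example record the start and end positions (1 and 312) span the whole 312-residue sequence, which is consistent with a keyword hit covering the full entry rather than a sub-region.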
| 
Generate DHF files from keyword search of UniProt.
Version: EMBOSS:6.6.0.0
   Standard (Mandatory) qualifiers:
  [-keyfile]           infile     This option specifies the name of keywords
                                  file (input). This contains a list of
                                  keywords specific to a number of SCOP or
                                  CATH families and superfamilies used by
                                  SEQWORDS to search a sequence database.
  [-spfile]            infile     This option specifies the name of the
                                  sequence database (input) to search.
  [-outfile]           outfile    [test.hits] This option specifies the name
                                  of the DHF file (domain hits file) (output).
                                  A 'domain hits file' contains database hits
                                  (sequences) with domain classification
                                  information, in the DHF format (FASTA-like).
                                  The hits are relatives to a SCOP or CATH
                                  family (or other node in the structural
                                  hierarchies) and are found from a search of
                                  a sequence database. Files containing hits
                                  retrieved by PSIBLAST are generated by using
                                  SEQSEARCH, hits retrieved by a sparse
                                  protein signature by using SIGSCAN or
                                  various types of HMM and profile by using
                                  LIBSCAN.
   Additional (Optional) qualifiers: (none)
   Advanced (Unprompted) qualifiers: (none)
   Associated qualifiers:
   "-outfile" associated qualifiers
   -odirectory3        string     Output directory
   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write first file to standard output
   -filter             boolean    Read first file from standard input, write
                                  first file to standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options and exit. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages
   -version            boolean    Report version number and exit
 | 
| Qualifier | Type | Description | Allowed values | Default | 
|---|---|---|---|---|
| Standard (Mandatory) qualifiers | ||||
| [-keyfile] (Parameter 1) | infile | This option specifies the name of the keywords file (input). The file contains a list of keywords specific to a number of SCOP or CATH families and superfamilies, which SEQWORDS uses to search a sequence database. | Input file | Required | 
| [-spfile] (Parameter 2) | infile | This option specifies the name of the sequence database (input) to search. | Input file | Required | 
| [-outfile] (Parameter 3) | outfile | This option specifies the name of the DHF file (domain hits file) (output). A 'domain hits file' contains database hits (sequences) with domain classification information, in the DHF format (FASTA-like). The hits are relatives of a SCOP or CATH family (or other node in the structural hierarchies) and are found from a search of a sequence database. Files containing hits retrieved by PSIBLAST are generated by SEQSEARCH, hits retrieved by a sparse protein signature by SIGSCAN, and hits retrieved by various types of HMM and profile by LIBSCAN. | Output file | test.hits | 
| Additional (Optional) qualifiers | ||||
| (none) | ||||
| Advanced (Unprompted) qualifiers | ||||
| (none) | ||||
| Associated qualifiers | ||||
| "-outfile" associated outfile qualifiers | ||||
| -odirectory3 or -odirectory_outfile | string | Output directory | Any string | |
| General qualifiers | ||||
| -auto | boolean | Turn off prompts | Boolean value Yes/No | N | 
| -stdout | boolean | Write first file to standard output | Boolean value Yes/No | N | 
| -filter | boolean | Read first file from standard input, write first file to standard output | Boolean value Yes/No | N | 
| -options | boolean | Prompt for standard and additional values | Boolean value Yes/No | N | 
| -debug | boolean | Write debug output to program.dbg | Boolean value Yes/No | N | 
| -verbose | boolean | Report some/full command line options | Boolean value Yes/No | Y | 
| -help | boolean | Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose | Boolean value Yes/No | N | 
| -warning | boolean | Report warnings | Boolean value Yes/No | Y | 
| -error | boolean | Report errors | Boolean value Yes/No | Y | 
| -fatal | boolean | Report fatal errors | Boolean value Yes/No | Y | 
| -die | boolean | Report dying program messages | Boolean value Yes/No | Y | 
| -version | boolean | Report version number and exit | Boolean value Yes/No | N | 
| 
% seqwords 
Generate DHF files from keyword search of UniProt.
Keywords file: seqwords.terms
Swissprot-format database file: seqwords.seq
Domain hits output file [test.hits]: seqwords.dhf
 | 
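The interactive session above can also be expressed non-interactively with the qualifiers listed in the command-line table (`-keyfile`, `-spfile`, `-outfile`, and `-auto` to turn off prompts). A minimal sketch, assuming `seqwords` is on the `PATH`; the helper names `build_seqwords_command` and `run_seqwords` are hypothetical wrappers, not part of EMBOSS.

```python
import subprocess


def build_seqwords_command(keyfile: str, spfile: str,
                           outfile: str = "test.hits") -> list[str]:
    """Assemble the argument list for a prompt-free seqwords run."""
    return [
        "seqwords",
        "-keyfile", keyfile,    # keywords file (input)
        "-spfile", spfile,      # Swiss-Prot-format database file (input)
        "-outfile", outfile,    # domain hits file (output)
        "-auto",                # turn off prompts
    ]


def run_seqwords(keyfile: str, spfile: str, outfile: str = "test.hits") -> None:
    """Run seqwords; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_seqwords_command(keyfile, spfile, outfile), check=True)


# Same file names as in the example session above.
print(" ".join(build_seqwords_command("seqwords.terms", "seqwords.seq", "seqwords.dhf")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.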
| FILE TYPE | FORMAT | DESCRIPTION | CREATED BY | SEE ALSO | 
|---|---|---|---|---|
| Domain hits file | DHF format (FASTA-like). | Database hits (sequences) with domain classification information. The hits are relatives of a SCOP or CATH family (or other node in the structural hierarchies) and are found from a search of a discriminating element (e.g. a protein signature, hidden Markov model, simple frequency matrix, Gribskov profile or Henikoff profile) against a sequence database. | SEQSEARCH (hits retrieved by PSIBLAST). SIGSCAN (hits retrieved by sparse protein signature). LIBSCAN (hits retrieved by various types of HMM and profile). | N.A. | 
| Keywords file | Text | Contains a list of keywords specific to a number of SCOP families and superfamilies used by SEQWORDS to search a sequence database. | N.A. | N.A. | 
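To illustrate how a keywords file can drive a database search, the sketch below does a case-insensitive substring match of search terms against entry descriptions. This is a simplified stand-in, not the SEQWORDS algorithm: the documentation above does not say which parts of an entry are searched or how matching is done, so matching against the description text is an assumption, and `find_keyword_hits` and the second entry are hypothetical.

```python
def find_keyword_hits(terms: list[str],
                      entries: list[tuple[str, str]]) -> list[str]:
    """Return accessions whose description contains any search term.

    `entries` is a list of (accession, description) pairs. Matching is
    case-insensitive because the descriptions in the example database
    are upper case while the keywords are mixed case.
    """
    lowered = [t.lower() for t in terms]
    hits = []
    for accession, description in entries:
        text = description.lower()
        if any(term in text for term in lowered):
            hits.append(accession)
    return hits


terms = ["Lactate dehydrogenase", "Malate dehydrogenase"]
entries = [
    # Description taken from the example entry shown earlier.
    ("P05313", "ISOCITRATE LYASE (EC 4.1.3.1) (ISOCITRASE) (ISOCITRATASE) (ICL)."),
    # Hypothetical entry, for illustration only.
    ("HYPO_1", "MALATE DEHYDROGENASE."),
]
print(find_keyword_hits(terms, entries))
```

Plain substring matching is deliberately naive; it would, for instance, also match a term that occurs inside a longer enzyme name, which a production tool would need to guard against.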
| Program name | Description | 
|---|---|
| cathparse | Generate DCF file from raw CATH files | 
| domainalign | Generate alignments (DAF file) for nodes in a DCF file | 
| domainnr | Remove redundant domains from a DCF file | 
| domainrep | Reorder DCF file to identify representative structures | 
| domainseqs | Add sequence records to a DCF file | 
| domainsse | Add secondary structure records to a DCF file | 
| helixturnhelix | Identify nucleic acid-binding motifs in protein sequences | 
| libgen | Generate discriminating elements from alignments | 
| matgen3d | Generate a 3D-1D scoring matrix from CCF files | 
| pepcoil | Predict coiled coil regions in protein sequences | 
| rocon | Generate a hits file from comparing two DHF files | 
| rocplot | Perform ROC analysis on hits files | 
| scopparse | Generate DCF file from raw SCOP files | 
| seqalign | Extend alignments (DAF file) with sequences (DHF file) | 
| seqfraggle | Remove fragment sequences from DHF files | 
| seqsort | Remove ambiguous classified sequences from DHF files | 
| ssematch | Search a DCF file for secondary structure matches | 
See also http://emboss.sourceforge.net/