Documentation

Here we present detailed instructions for using Hoea as an installed Python module.

1, after you download the source tarball, decompress it and change into the newly created Hoea-version directory.

2, run the following command to install the module. Make sure you have sufficient privileges if you want to install it into a system directory (we assume here that you do). The $ is the shell command prompt, here and below.

$ sudo python setup.py install
running install
running build
running build_py
creating build
creating build/lib
creating build/lib/Hoea
copying src/Hoea/KOIterator.py -> build/lib/Hoea
copying src/Hoea/GOIterator.py -> build/lib/Hoea
copying src/Hoea/Hoea.py -> build/lib/Hoea
copying src/Hoea/__init__.py -> build/lib/Hoea
creating build/lib/Hoea/data
copying src/Hoea/data/ko.rel -> build/lib/Hoea/data
copying src/Hoea/data/ko.def -> build/lib/Hoea/data
copying src/Hoea/data/go.rel -> build/lib/Hoea/data
copying src/Hoea/data/go.def -> build/lib/Hoea/data
creating build/lib/Hoea/sample
copying src/Hoea/sample/mt_probe_go.txt -> build/lib/Hoea/sample
copying src/Hoea/sample/probe_6h_up.txt -> build/lib/Hoea/sample
running build_scripts
creating build/scripts-2.4
copying and adjusting Hoea -> build/scripts-2.4
copying and adjusting GOIterator -> build/scripts-2.4
copying and adjusting KOIterator -> build/scripts-2.4
changing mode of build/scripts-2.4/Hoea from 644 to 755
changing mode of build/scripts-2.4/GOIterator from 644 to 755
changing mode of build/scripts-2.4/KOIterator from 644 to 755
running install_lib
creating /usr/local/lib/python2.4/site-packages/Hoea
copying build/lib/Hoea/KOIterator.py -> /usr/local/lib/python2.4/site-packages/Hoea
copying build/lib/Hoea/GOIterator.py -> /usr/local/lib/python2.4/site-packages/Hoea
copying build/lib/Hoea/Hoea.py -> /usr/local/lib/python2.4/site-packages/Hoea
copying build/lib/Hoea/__init__.py -> /usr/local/lib/python2.4/site-packages/Hoea
creating /usr/local/lib/python2.4/site-packages/Hoea/sample
copying build/lib/Hoea/sample/mt_probe_go.txt -> /usr/local/lib/python2.4/site-packages/Hoea/sample
copying build/lib/Hoea/sample/probe_6h_up.txt -> /usr/local/lib/python2.4/site-packages/Hoea/sample
creating /usr/local/lib/python2.4/site-packages/Hoea/data
copying build/lib/Hoea/data/ko.rel -> /usr/local/lib/python2.4/site-packages/Hoea/data
copying build/lib/Hoea/data/ko.def -> /usr/local/lib/python2.4/site-packages/Hoea/data
copying build/lib/Hoea/data/go.rel -> /usr/local/lib/python2.4/site-packages/Hoea/data
copying build/lib/Hoea/data/go.def -> /usr/local/lib/python2.4/site-packages/Hoea/data
byte-compiling /usr/local/lib/python2.4/site-packages/Hoea/KOIterator.py to KOIterator.pyc
byte-compiling /usr/local/lib/python2.4/site-packages/Hoea/GOIterator.py to GOIterator.pyc
byte-compiling /usr/local/lib/python2.4/site-packages/Hoea/Hoea.py to Hoea.pyc
byte-compiling /usr/local/lib/python2.4/site-packages/Hoea/__init__.py to __init__.pyc
running install_scripts
copying build/scripts-2.4/KOIterator -> /usr/local/bin
copying build/scripts-2.4/Hoea -> /usr/local/bin
copying build/scripts-2.4/GOIterator -> /usr/local/bin
changing mode of /usr/local/bin/KOIterator to 755
changing mode of /usr/local/bin/Hoea to 755
changing mode of /usr/local/bin/GOIterator to 755
	
If no error occurs, you have installed Hoea successfully. Congratulations!

3, after installation, three new commands are available, as shown in the Tutorial: Hoea, GOIterator and KOIterator. Running a command without any options prints a help message like:

$ GOIterator
usage: GOIterator -g gene_ontology_ext.obo -r go.relation -d go.definition

GOIterator: error: must specify the GO file,like gene_ontology_ext.obo,use -h to see parameters
These commands can be used in the same way as the scripts shown in the Tutorial.
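A quick way to confirm that step 3 worked is to check that the three commands are on the PATH and that the Hoea package can be found. The following is a minimal sketch, not part of Hoea: it is written for Python 3 (the sessions in this document use Python 2.4), and the function name check_install is hypothetical.

```python
import importlib.util
import shutil

# Command names taken from the install log above.
COMMANDS = ("Hoea", "GOIterator", "KOIterator")


def check_install(commands=COMMANDS, module="Hoea"):
    """Return a dict describing what could be found on this machine."""
    report = {}
    for name in commands:
        # shutil.which returns the full path, or None if not on PATH.
        report[name] = shutil.which(name)
    # find_spec returns None when the top-level module cannot be found.
    report["module:" + module] = importlib.util.find_spec(module) is not None
    return report


if __name__ == "__main__":
    for key, value in check_install().items():
        print(key, "->", value)
```

A value of None for a command, or False for the module, suggests the install went to a location that is not on the PATH or not on the interpreter's module search path.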

4, the GOIterator module
The GOIterator module is used to parse the gene_ontology*.obo file, so first we download the gene_ontology_ext.obo file from: ftp://ftp.geneontology.org/go/ontology/obo_format_1_2/gene_ontology_ext.obo. Let's assume we have downloaded the newest gene_ontology_ext.obo file and placed it in the current directory; we then open a Python interpreter to parse this file.

$ python
Python 2.4.4 (#1, Mar 11 2008, 23:13:49)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Hoea import GOIterator as GI
>>> gis = GI.GOIterator('gene_ontology_ext.obo')
>>> gos = gis.parse()
>>> firstGO = gos.next()
>>> firstGO.id
'GO:0000001'
firstGO is a GORecord object with many attributes; for example, id is this GO term's id. Many other attributes are available:
>>> firstGO.name
'mitochondrion inheritance'
>>> firstGO.is_a
['GO:0048308 ! organelle inheritance', 'GO:0048311 ! mitochondrion distribution']
The is_a attribute indicates that this GO term is_a GO:0048308 and is_a GO:0048311; in Hoea, GO:0048308 and GO:0048311 are called the parent terms of GO:0000001.
>>> firstGO.parents
[<Hoea.GOIterator.GORelationship instance at 0xb7f68dec>, <Hoea.GOIterator.GORelationship instance at 0xb7f68e0c>]
>>> firstGO.parents[0].id
'GO:0048308'
>>> firstGO.parents[0].relationship
'is_a'
>>> firstGO.parents[0].name
'organelle inheritance'
The parents attribute returns a list of the parents of this term. Each member of this list is a GORelationship object with id, name and relationship attributes.
>>> secondGO = gos.next()
>>> secondGO.id
'GO:0000002'
>>> secondGO.name,secondGO.namespace
('mitochondrial genome maintenance', 'biological_process')
For example, if you would like to generate the GO definition file, you can use a for loop like this:
>>> for i in gos:
...     print i.id,i.name
Note: firstGO.info will print a dictionary listing all attributes of this GO term.
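To make the session above more concrete, here is a minimal sketch of how [Term] stanzas in an OBO file can be read and how an is_a value splits into a parent id and name. It is not Hoea's implementation: the functions parse_terms and split_parent are hypothetical, the code is Python 3, and the input is a small in-memory sample that reuses the ids and names shown above.

```python
import io

# A tiny in-memory stand-in for gene_ontology_ext.obo; ids and names
# are the ones shown in the session above.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: GO:0000001
name: mitochondrion inheritance
namespace: biological_process
is_a: GO:0048308 ! organelle inheritance
is_a: GO:0048311 ! mitochondrion distribution

[Term]
id: GO:0000002
name: mitochondrial genome maintenance
namespace: biological_process
is_a: GO:0007005
"""


def parse_terms(handle):
    """Yield one dict per [Term] stanza: tag -> list of values."""
    record = None
    for raw in handle:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; emit the previous [Term], if any.
            if record is not None:
                yield record
            record = {} if line == "[Term]" else None
        elif record is not None and ": " in line:
            tag, value = line.split(": ", 1)
            record.setdefault(tag, []).append(value)
    if record is not None:
        yield record


def split_parent(value):
    """Split 'GO:0048308 ! organelle inheritance' into (id, name)."""
    term_id, _, name = value.partition("!")
    return term_id.strip(), name.strip()


terms = list(parse_terms(io.StringIO(SAMPLE_OBO)))
for term in terms:
    # Same two columns as a definition file: id, then name.
    print(term["id"][0] + "\t" + term["name"][0])
```

Keeping every tag as a list is a deliberate choice here, because tags such as is_a can occur several times within one term.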

5, the KOIterator module
The KOIterator module works much like the GOIterator module, although every KO record has different properties. We get the ko file from ftp://ftp.genome.jp/pub/kegg/genes/ko and put it in the current directory.

$ python
Python 2.4.4 (#1, Mar 11 2008, 23:13:49)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Hoea import KOIterator as KI
>>> kis = KI.KOIterator('ko')
>>> kos = kis.parse()
>>> firstKO = kos.next()
>>> firstKO.entry_id
'K00001'
>>> firstKO.definition
'alcohol dehydrogenase [EC:1.1.1.1]'
>>> firstKO.pathway
['ko00010 Glycolysis / Gluconeogenesis', 'ko00071  Fatty acid metabolism ko00071  Fatty acid metabolism', 'ko00350  Tyrosine metabolism ko00350  Tyrosine metabolism', 'ko00625  Chloroalkane and chloroalkene degradation ko00625  Chloroalkane and chloroalkene degradation', 'ko00626  Naphthalene degradation ko00626  Naphthalene degradation', 'ko00830  Retinol metabolism ko00830  Retinol metabolism', 'ko00980  Metabolism of xenobiotics by cytochrome P450 ko00980  Metabolism of xenobiotics by cytochrome P450', 'ko00982  Drug metabolism - cytochrome P450 ko00982  Drug metabolism - cytochrome P450']
>>>
The pathway attribute is a list of the pathways this KO term is involved in. Note: firstKO.d will print a dictionary listing all attributes of this KO term.
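As an illustration of what reading a KO record involves, the sketch below parses a record in the KEGG flat-file layout. That layout is an assumption made here (a field keyword at the start of a line, indented continuation lines, and /// closing each record), and the function parse_ko is hypothetical, written in Python 3; it is not Hoea's KOIterator. The sample record reuses only the values shown in the session above.

```python
import io

# In-memory sample in the assumed flat-file layout.
SAMPLE_KO = """\
ENTRY       K00001                      KO
DEFINITION  alcohol dehydrogenase [EC:1.1.1.1]
PATHWAY     ko00010  Glycolysis / Gluconeogenesis
            ko00071  Fatty acid metabolism
///
"""


def parse_ko(handle):
    """Yield one dict per record: field keyword -> list of value lines."""
    record, field = {}, None
    for raw in handle:
        line = raw.rstrip("\n")
        if not line.strip():
            continue
        if line.startswith("///"):
            # Record terminator: emit what was collected and start over.
            if record:
                yield record
            record, field = {}, None
            continue
        if line[:1].isspace():
            # Indented line: continuation of the current field.
            value = line.strip()
        else:
            parts = line.split(None, 1)
            field = parts[0]
            value = parts[1].strip() if len(parts) > 1 else ""
        if field is not None and value:
            record.setdefault(field, []).append(value)
    if record:
        yield record


records = list(parse_ko(io.StringIO(SAMPLE_KO)))
first = records[0]
print(first["ENTRY"][0].split()[0])   # the entry id
print(first["DEFINITION"][0])
print(first["PATHWAY"])               # one list item per pathway line
```

Storing one list item per line means a multi-line field such as PATHWAY comes out as a list with one pathway per element.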

Let's start a full analysis with Hoea from the beginning

Before we start, we assume that you have installed Hoea successfully. As a reminder, here are some example commands showing how to install Hoea (comments follow the # symbol):

$ wget -O Hoea-0.2.tar.gz http://sourceforge.net/projects/hoea/files/0.2/Hoea-0.2.tar.gz/download #download, saved under the name used in the next step
$ tar zxvf Hoea-0.2.tar.gz #decompress
$ cd Hoea-0.2 #change directory
$ python setup.py install #installation process, might need some privileges
Let's start this journey. The files we have so far:
$ ls
all.go  all.ko  com2up.id
The first 10 lines of each file look like this:
$ head all.ko
1_scaffold_10016-1      K12867
1_scaffold_19693-0      K08803
1_scaffold_4523-0       K02149
1_scaffold_25008-0      K10257
1_scaffold_22510-0      K04077
1_scaffold_17237-0      K03063
1_scaffold_5829-1       K11975
1_scaffold_13429-0      K10712
1_scaffold_4963-0       K01714
1_scaffold_20563-0      K13468
$ head all.go
1_scaffold_4605-0       0006915
1_scaffold_35814-0      0006418
1_scaffold_35814-0      0006412
3_scaffold_17977-0      0008152
3_scaffold_17578-0      0006468
3_scaffold_17578-0      0006468
2_scaffold_9265-0       0007165
2_scaffold_9265-0       0045087
3_scaffold_7175-0       0009416
4_scaffold_3554-0       0030001
$ head com2up.id
3_scaffold_7080-0
3_scaffold_43440-0
2_contig35846
1_scaffold_6859-1
1_scaffold_32033-0
1_scaffold_8190-0
3_scaffold_10050-0
3_scaffold_5957-0
2_scaffold_4660-0
3_scaffold_13583-0
The com2up.id file is the input file for the enrichment analysis; all.ko and all.go are the KO and GO annotation files, respectively.
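The two kinds of input can be read with a few lines of Python. The sketch below is not Hoea's code; the functions load_annotations and load_ids are hypothetical, and the code is Python 3. The annotation lines come from the all.go listing above. The id list uses two ids from the com2up.id listing plus one id (3_scaffold_7175-0) borrowed from all.go purely so that the illustration has an overlap.

```python
import io
from collections import defaultdict

# A few lines in the same layout as all.go above.
ALL_GO = """\
1_scaffold_35814-0      0006418
1_scaffold_35814-0      0006412
3_scaffold_17578-0      0006468
3_scaffold_17578-0      0006468
3_scaffold_7175-0       0009416
"""

# Two ids from com2up.id plus one added for illustration only.
INPUT_IDS = """\
3_scaffold_7080-0
3_scaffold_43440-0
3_scaffold_7175-0
"""


def load_annotations(handle):
    """Map each gene id to the set of terms annotated to it."""
    annotations = defaultdict(set)
    for line in handle:
        fields = line.split()
        if len(fields) >= 2:
            annotations[fields[0]].add(fields[1])
    return annotations


def load_ids(handle):
    """Read one id per line, ignoring blank lines."""
    return [line.strip() for line in handle if line.strip()]


background = load_annotations(io.StringIO(ALL_GO))
query = load_ids(io.StringIO(INPUT_IDS))
annotated = [gene for gene in query if gene in background]
print("input %d items, %d with annotations" % (len(query), len(annotated)))
print("%d items have annotations in background" % len(background))
```

Using a set per gene collapses repeated lines, such as the duplicated 3_scaffold_17578-0 entry, into a single annotation.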
Now we generate the KO and GO relationship and definition files. First download the frequently updated ho files:
$ wget -b ftp://ftp.geneontology.org/go/ontology/obo_format_1_2/gene_ontology_ext.obo
$ wget -b ftp://ftp.genome.jp/pub/kegg/genes/ko
Run the GOIterator and KOIterator commands as below to generate the files:
$ GOIterator -g gene_ontology_ext.obo -r go.rel -d go.def
$ KOIterator -k ko -r ko.rel -d ko.def
See what the directory contains now:
$ ls
all.go  all.ko  com2up.id  gene_ontology_ext.obo  go.def  go.rel  ko  ko.def  ko.rel
Let's take a look at these .def and .rel files:
$ head go.def
GO:0000001      mitochondrion inheritance
GO:0000002      mitochondrial genome maintenance
GO:0000003      reproduction
GO:0000005      ribosomal chaperone activity
GO:0000006      high affinity zinc uptake transmembrane transporter activity
GO:0000007      low-affinity zinc ion transmembrane transporter activity
GO:0000008      thioredoxin
GO:0000009      alpha-1,6-mannosyltransferase activity
GO:0000010      trans-hexaprenyltranstransferase activity
GO:0000011      vacuole inheritance
$ head go.rel
GO:0000001      is_a    GO:0048308
GO:0000001      is_a    GO:0048311
GO:0000002      is_a    GO:0007005
GO:0019952      alt_id  GO:0000003
GO:0050876      alt_id  GO:0000003
GO:0000003      is_a    GO:0008150
GO:0000006      is_a    GO:0005385
GO:0000007      is_a    GO:0005385
GO:0000013      alt_id  GO:0000008
GO:0000009      is_a    GO:0000030
$ head ko.def
4       Cellular Processes
5       Human Diseases
6       Genetic Information Processing
3       Environmental Information Processing
2       Organismal Systems
1       Metabolism
1.14    Nucleotide Metabolism
1.6     Biosynthesis of Other Secondary Metabolites
2.10    Excretory System
6.19    Transcription
$ head ko.rel
1.4     is_a    1
1.5     is_a    1
1.6     is_a    1
1.7     is_a    1
1.1     is_a    1
1.2     is_a    1
1.3     is_a    1
3.11    is_a    3
1.8     is_a    1
4.13    is_a    4
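The .rel files above have three columns: a term, a relation, and the term it points to. A minimal sketch of turning such a file into a parent map and walking up the hierarchy is given below. The functions load_relations and ancestors are hypothetical, the code is Python 3, and following every relation type (including alt_id) as a parent link is an assumption of this sketch, not a statement about how Hoea treats them.

```python
import io
from collections import defaultdict

# Lines in the same layout as the go.rel listing above.
GO_REL = """\
GO:0000001      is_a    GO:0048308
GO:0000001      is_a    GO:0048311
GO:0000002      is_a    GO:0007005
GO:0019952      alt_id  GO:0000003
GO:0000003      is_a    GO:0008150
"""


def load_relations(handle):
    """Return child -> list of (relation, parent) from a three-column file."""
    parents = defaultdict(list)
    for line in handle:
        fields = line.split()
        if len(fields) == 3:
            child, relation, parent = fields
            parents[child].append((relation, parent))
    return parents


def ancestors(term, parents):
    """Collect every term reachable by repeatedly following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


rel = load_relations(io.StringIO(GO_REL))
print(sorted(ancestors("GO:0000001", rel)))
print(sorted(ancestors("GO:0019952", rel)))
```

The same two functions work unchanged on lines in the ko.rel layout, for example a line relating 1.4 to 1 by is_a.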
With this, all preparation is finished and we can run the analysis.
The GO level:
$ Hoea -d go.def -r go.rel -i com2up.id -a all.go
read in ho relationship file... done
read in ho definition file... done
read in background ho annotations... done
generate ho term to gene id mapping,this may take a little while...
done
start analysis...
input 3932 items for analysis, 946 items with annotations
20523 items have annotations in background
All Done!
The KO level:
$ Hoea -d ko.def -r ko.rel -i com2up.id -a all.ko
read in ho relationship file... done
read in ho definition file... done
read in background ho annotations... done
generate ho term to gene id mapping,this may take a little while...
done
start analysis...
input 3932 items for analysis, 2503 items with annotations
55195 items have annotations in background
All Done!
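This walkthrough does not state which statistic Hoea computes; the com2up.id_Rcom.R file in the final listing suggests the statistics are delegated to R. A common choice for this kind of enrichment analysis is the one-sided hypergeometric test, sketched below in Python 3 as an illustration only. The function hypergeom_pvalue is hypothetical, and the term counts used in the usage lines (K and k) are made-up values, not results from this run.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when drawing n items from N, of which K carry the term.

    k: annotated input items carrying the term
    n: annotated input items in total
    K: background items carrying the term
    N: annotated background items in total
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# n and N are the GO-level counts printed above; K and k are hypothetical.
p = hypergeom_pvalue(k=15, n=946, K=120, N=20523)
print("p-value for the hypothetical term: %.3g" % p)
```

The exact integer arithmetic keeps the sketch simple and avoids rounding problems; for many terms a statistics library would be the more practical choice.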
Finally, see what the current directory contains now:
$ ls
all.go              com2up.id_GO_2.png  com2up.id_KO_2.png  com2up.id_KO_5.png    com2up.id_Routput.txt  ko.rel
all.ko              com2up.id_GO_3.dot  com2up.id_KO_3.dot  com2up.id_KO_6.dot    gene_ontology_ext.obo
com2up.id           com2up.id_GO_3.png  com2up.id_KO_3.png  com2up.id_KO_6.png    go.def
com2up.id_GO_1.dot  com2up.id_KO_1.dot  com2up.id_KO_4.dot  com2up.id_Rcom.R      go.rel
com2up.id_GO_1.png  com2up.id_KO_1.png  com2up.id_KO_4.png  com2up.id_Rcom.Rout   ko
com2up.id_GO_2.dot  com2up.id_KO_2.dot  com2up.id_KO_5.dot  com2up.id_Rinput.txt  ko.def
Really all done, we have finished this analysis :-)