Documentation
This page gives detailed instructions for using Hoea as an installed Python module.
1, After downloading the source tarball, decompress it and change into the newly created Hoea-version directory.
2, Run the following command to install the module. Make sure you have sufficient privileges if you want to install it into a system directory; we assume here that you do. ($ denotes the shell command prompt, here and below.)
    $ sudo python setup.py install
    running install
    running build
    running build_py
    creating build
    creating build/lib
    creating build/lib/Hoea
    copying src/Hoea/KOIterator.py -> build/lib/Hoea
    copying src/Hoea/GOIterator.py -> build/lib/Hoea
    copying src/Hoea/Hoea.py -> build/lib/Hoea
    copying src/Hoea/__init__.py -> build/lib/Hoea
    creating build/lib/Hoea/data
    copying src/Hoea/data/ko.rel -> build/lib/Hoea/data
    copying src/Hoea/data/ko.def -> build/lib/Hoea/data
    copying src/Hoea/data/go.rel -> build/lib/Hoea/data
    copying src/Hoea/data/go.def -> build/lib/Hoea/data
    creating build/lib/Hoea/sample
    copying src/Hoea/sample/mt_probe_go.txt -> build/lib/Hoea/sample
    copying src/Hoea/sample/probe_6h_up.txt -> build/lib/Hoea/sample
    running build_scripts
    creating build/scripts-2.4
    copying and adjusting Hoea -> build/scripts-2.4
    copying and adjusting GOIterator -> build/scripts-2.4
    copying and adjusting KOIterator -> build/scripts-2.4
    changing mode of build/scripts-2.4/Hoea from 644 to 755
    changing mode of build/scripts-2.4/GOIterator from 644 to 755
    changing mode of build/scripts-2.4/KOIterator from 644 to 755
    running install_lib
    creating /usr/local/lib/python2.4/site-packages/Hoea
    copying build/lib/Hoea/KOIterator.py -> /usr/local/lib/python2.4/site-packages/Hoea
    copying build/lib/Hoea/GOIterator.py -> /usr/local/lib/python2.4/site-packages/Hoea
    copying build/lib/Hoea/Hoea.py -> /usr/local/lib/python2.4/site-packages/Hoea
    copying build/lib/Hoea/__init__.py -> /usr/local/lib/python2.4/site-packages/Hoea
    creating /usr/local/lib/python2.4/site-packages/Hoea/sample
    copying build/lib/Hoea/sample/mt_probe_go.txt -> /usr/local/lib/python2.4/site-packages/Hoea/sample
    copying build/lib/Hoea/sample/probe_6h_up.txt -> /usr/local/lib/python2.4/site-packages/Hoea/sample
    creating /usr/local/lib/python2.4/site-packages/Hoea/data
    copying build/lib/Hoea/data/ko.rel -> /usr/local/lib/python2.4/site-packages/Hoea/data
    copying build/lib/Hoea/data/ko.def -> /usr/local/lib/python2.4/site-packages/Hoea/data
    copying build/lib/Hoea/data/go.rel -> /usr/local/lib/python2.4/site-packages/Hoea/data
    copying build/lib/Hoea/data/go.def -> /usr/local/lib/python2.4/site-packages/Hoea/data
    byte-compiling /usr/local/lib/python2.4/site-packages/Hoea/KOIterator.py to KOIterator.pyc
    byte-compiling /usr/local/lib/python2.4/site-packages/Hoea/GOIterator.py to GOIterator.pyc
    byte-compiling /usr/local/lib/python2.4/site-packages/Hoea/Hoea.py to Hoea.pyc
    byte-compiling /usr/local/lib/python2.4/site-packages/Hoea/__init__.py to __init__.pyc
    running install_scripts
    copying build/scripts-2.4/KOIterator -> /usr/local/bin
    copying build/scripts-2.4/Hoea -> /usr/local/bin
    copying build/scripts-2.4/GOIterator -> /usr/local/bin
    changing mode of /usr/local/bin/KOIterator to 755
    changing mode of /usr/local/bin/Hoea to 755
    changing mode of /usr/local/bin/GOIterator to 755

If no error occurs, Hoea has been installed successfully. Congratulations!
3, After installation, three new commands are available, as shown in the Tutorial: Hoea, GOIterator and KOIterator. Running any of them without options prints a help message such as:

    $ GOIterator
    usage: GOIterator -g gene_ontology_ext.obo -r go.relation -d go.definition
    GOIterator: error: must specify the GO file,like gene_ontology_ext.obo,use -h to see parameters

These commands can be used in the same way as the scripts shown in the Tutorial.
4, the GOIterator module
The GOIterator module is used to parse the gene_ontology*.obo file, so first we download the gene_ontology_ext.obo file from: ftp://ftp.geneontology.org/go/ontology/obo_format_1_2/gene_ontology_ext.obo. Let's assume we have downloaded the newest gene_ontology_ext.obo file and placed it in the current directory. We now open a Python interpreter to parse this file.
    $ python
    Python 2.4.4 (#1, Mar 11 2008, 23:13:49)
    [GCC 3.4.6 20060404 (Red Hat 3.4.6-8)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from Hoea import GOIterator as GI
    >>> gis = GI.GOIterator('gene_ontology_ext.obo')
    >>> gos = gis.parse()
    >>> firstGO = gos.next()
    >>> firstGO.id
    'GO:0000001'

firstGO is a GORecord object with many attributes; id, for example, is this GO term's id. Many other attributes are available:
    >>> firstGO.name
    'mitochondrion inheritance'
    >>> firstGO.is_a
    ['GO:0048308 ! organelle inheritance', 'GO:0048311 ! mitochondrion distribution']

The is_a attribute indicates that this GO term is_a GO:0048308 and GO:0048311. In Hoea, GO:0048308 and GO:0048311 are called the parent terms of GO:0000001.
    >>> firstGO.parents
    [<Hoea.GOIterator.GORelationship instance at 0xb7f68dec>,
     <Hoea.GOIterator.GORelationship instance at 0xb7f68e0c>]
    >>> firstGO.parents[0].id
    'GO:0048308'
    >>> firstGO.parents[0].relationship
    'is_a'
    >>> firstGO.parents[0].name
    'organelle inheritance'

The parents attribute returns a list of the parents of this term. Each member of the list is a GORelationship object with id, name and relationship attributes.
    >>> secondGO = gos.next()
    >>> secondGO.id
    'GO:0000002'
    >>> secondGO.name,secondGO.namespace
    ('mitochondrial genome maintenance', 'biological_process')

For example, to generate the GO definition file you can use a for loop like this:
    >>> for i in gos:
    ...     print i.id,i.name

Note: firstGO.info will print a dictionary listing all attributes of this GO term.
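To make clearer what such a record iterator has to do, here is a minimal standalone sketch in modern Python 3 (the session above uses Python 2.4). It is not Hoea's implementation: the function name parse_obo_terms and the inline sample are hypothetical, and it only assumes the usual OBO layout of [Term] stanzas with "key: value" lines.

```python
import io

def parse_obo_terms(handle):
    """Yield one dict per [Term] stanza with 'id', 'name' and 'is_a' keys.

    Hypothetical helper for illustration only; not part of Hoea.
    """
    term = None
    for raw in handle:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins: emit the previous term, if there was one.
            if term is not None:
                yield term
            # Only [Term] stanzas are collected; others (e.g. [Typedef]) are skipped.
            term = {"id": None, "name": None, "is_a": []} if line == "[Term]" else None
        elif term is not None and ": " in line:
            key, value = line.split(": ", 1)
            if key == "is_a":
                # Keep the identifier only, dropping any trailing "! name" comment.
                term["is_a"].append(value.split(" ! ")[0])
            elif key in ("id", "name"):
                term[key] = value
    if term is not None:
        yield term

# Small inline sample built from the values shown in the session above.
SAMPLE = """\
format-version: 1.2

[Term]
id: GO:0000001
name: mitochondrion inheritance
is_a: GO:0048308 ! organelle inheritance
is_a: GO:0048311 ! mitochondrion distribution

[Term]
id: GO:0000002
name: mitochondrial genome maintenance
is_a: GO:0007005

[Typedef]
id: part_of
name: part_of
"""

terms = list(parse_obo_terms(io.StringIO(SAMPLE)))
for t in terms:
    # One "id<TAB>name" line per term, similar in spirit to a definition file.
    print(t["id"], t["name"], sep="\t")
```

To try it on the real file, pass an open file handle for gene_ontology_ext.obo instead of the io.StringIO sample.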
5, the KOIterator module
The KOIterator module works much like the GOIterator module, although a KO record has different properties. We get the ko file from ftp://ftp.genome.jp/pub/kegg/genes/ko and put it in the current directory.
    $ python
    Python 2.4.4 (#1, Mar 11 2008, 23:13:49)
    [GCC 3.4.6 20060404 (Red Hat 3.4.6-8)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from Hoea import KOIterator as KI
    >>> kis = KI.KOIterator('ko')
    >>> kos = kis.parse()
    >>> firstKO = kos.next()
    >>> firstKO.entry_id
    'K00001'
    >>> firstKO.definition
    'alcohol dehydrogenase [EC:1.1.1.1]'
    >>> firstKO.pathway
    ['ko00010 Glycolysis / Gluconeogenesis',
     'ko00071 Fatty acid metabolism ko00071 Fatty acid metabolism',
     'ko00350 Tyrosine metabolism ko00350 Tyrosine metabolism',
     'ko00625 Chloroalkane and chloroalkene degradation ko00625 Chloroalkane and chloroalkene degradation',
     'ko00626 Naphthalene degradation ko00626 Naphthalene degradation',
     'ko00830 Retinol metabolism ko00830 Retinol metabolism',
     'ko00980 Metabolism of xenobiotics by cytochrome P450 ko00980 Metabolism of xenobiotics by cytochrome P450',
     'ko00982 Drug metabolism - cytochrome P450 ko00982 Drug metabolism - cytochrome P450']
    >>>

The pathway attribute is a list of the pathways this KO term is involved in. Note: firstKO.d will print a dictionary listing all attributes of this KO term.
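As a rough illustration of what parsing the ko file involves, here is a standalone Python 3 sketch. It is not Hoea's KOIterator: the names parse_ko_records and flat_line are hypothetical, and it assumes the usual KEGG flat-file layout, in which the first 12 columns hold the field label (blank on continuation lines) and each record ends with a line starting with ///.

```python
import io

def parse_ko_records(handle):
    """Yield one dict per record, mapping a field label to its list of value lines.

    Hypothetical helper for illustration only; not part of Hoea.
    """
    record, label = {}, None
    for raw in handle:
        line = raw.rstrip("\n")
        if line.startswith("///"):
            # Record terminator: emit what has been collected so far.
            if record:
                yield record
            record, label = {}, None
            continue
        if not line.strip():
            continue
        # Columns 0-11 carry the field label; an empty label means the line
        # continues the previous field.
        head, value = line[:12].strip(), line[12:].strip()
        if head:
            label = head
        if label is not None:
            record.setdefault(label, []).append(value)
    if record:
        yield record

def flat_line(label, value):
    """Build one sample line with the label padded to 12 columns."""
    return "{:<12}{}\n".format(label, value)

# Tiny sample using values from the session above; K00002 is only a placeholder
# second entry so that the record separator is exercised.
SAMPLE = (
    flat_line("ENTRY", "K00001                      KO")
    + flat_line("DEFINITION", "alcohol dehydrogenase [EC:1.1.1.1]")
    + flat_line("PATHWAY", "ko00010  Glycolysis / Gluconeogenesis")
    + flat_line("", "ko00830  Retinol metabolism")
    + "///\n"
    + flat_line("ENTRY", "K00002                      KO")
    + "///\n"
)

records = list(parse_ko_records(io.StringIO(SAMPLE)))
for rec in records:
    entry_id = rec["ENTRY"][0].split()[0]
    print(entry_id, rec.get("DEFINITION", [""])[0], rec.get("PATHWAY", []))
```

Keeping every field as a list of lines keeps the sketch short; a real parser would likely convert single-valued fields such as DEFINITION to plain strings.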
Let's start a full analysis with Hoea from the beginning
Before starting, we assume that you have installed Hoea successfully. As a reminder, here are example commands showing how to install Hoea (comments follow the # symbol):
    $ wget http://sourceforge.net/projects/hoea/files/0.2/Hoea-0.2.tar.gz/download  #download
    $ tar zxvf Hoea-0.2.tar.gz   #decompress
    $ cd Hoea-0.2                #change directory
    $ python setup.py install    #installation process, might need some privileges

Let's start the journey. These are the files we have so far:
    $ ls
    all.go  all.ko  com2up.id

The first 10 lines of each of these files look like this:
    $ head all.ko
    1_scaffold_10016-1  K12867
    1_scaffold_19693-0  K08803
    1_scaffold_4523-0   K02149
    1_scaffold_25008-0  K10257
    1_scaffold_22510-0  K04077
    1_scaffold_17237-0  K03063
    1_scaffold_5829-1   K11975
    1_scaffold_13429-0  K10712
    1_scaffold_4963-0   K01714
    1_scaffold_20563-0  K13468

    $ head all.go
    1_scaffold_4605-0   0006915
    1_scaffold_35814-0  0006418
    1_scaffold_35814-0  0006412
    3_scaffold_17977-0  0008152
    3_scaffold_17578-0  0006468
    3_scaffold_17578-0  0006468
    2_scaffold_9265-0   0007165
    2_scaffold_9265-0   0045087
    3_scaffold_7175-0   0009416
    4_scaffold_3554-0   0030001
    $ head com2up.id
    3_scaffold_7080-0
    3_scaffold_43440-0
    2_contig35846
    1_scaffold_6859-1
    1_scaffold_32033-0
    1_scaffold_8190-0
    3_scaffold_10050-0
    3_scaffold_5957-0
    2_scaffold_4660-0
    3_scaffold_13583-0

The com2up.id file is the input file for the enrichment analysis; all.ko and all.go are the KO and GO annotation files, respectively.
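The relationship between these three files can be sketched in a few lines of Python 3. This is only an illustration under stated assumptions, not how Hoea reads its input: the helper names read_annotations and read_ids are hypothetical, the annotation file is assumed to be whitespace-separated with the gene id in the first column and the term in the second, and the short input list below is made up so that it overlaps the sample annotations.

```python
import io
from collections import defaultdict

def read_annotations(handle):
    """Map each gene id to the set of terms annotated to it (hypothetical helper)."""
    gene_to_terms = defaultdict(set)
    for line in handle:
        fields = line.split()
        if len(fields) >= 2:
            # A set absorbs repeated "gene term" lines such as the duplicate in all.go.
            gene_to_terms[fields[0]].add(fields[1])
    return gene_to_terms

def read_ids(handle):
    """Read one gene id per non-empty line (hypothetical helper)."""
    return [line.strip() for line in handle if line.strip()]

# A few annotation lines taken from the head of all.go shown above.
ALL_GO = """\
1_scaffold_4605-0 0006915
1_scaffold_35814-0 0006418
1_scaffold_35814-0 0006412
3_scaffold_17578-0 0006468
3_scaffold_17578-0 0006468
3_scaffold_7175-0 0009416
"""

# Hypothetical input list: the first id comes from com2up.id, the other two are
# borrowed from all.go purely so that the example has some overlap.
INPUT_IDS = """\
3_scaffold_7080-0
1_scaffold_35814-0
3_scaffold_7175-0
"""

annotations = read_annotations(io.StringIO(ALL_GO))
input_ids = read_ids(io.StringIO(INPUT_IDS))
annotated = [gene for gene in input_ids if gene in annotations]

print("input items:", len(input_ids))
print("input items with annotations:", len(annotated))
print("background items with annotations:", len(annotations))
```

Input ids without any annotation are simply left out of the annotated list, which is one plausible reason why an analysis can report fewer annotated items than input items.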
Next we generate the KO and GO relationship and definition files. First, download the frequently updated ho files:
    $ wget -b ftp://ftp.geneontology.org/go/ontology/obo_format_1_2/gene_ontology_ext.obo
    $ wget -b ftp://ftp.genome.jp/pub/kegg/genes/ko

Run the GOIterator and KOIterator commands as below to generate the files:
    $ GOIterator -g gene_ontology_ext.obo -r go.rel -d go.def
    $ KOIterator -k ko -r ko.rel -d ko.def

See what the directory contains now:
    $ ls
    all.go  all.ko  com2up.id  gene_ontology_ext.obo  go.def  go.rel  ko  ko.def  ko.rel

Let's take a look at these .def and .rel files:
    $ head go.def
    GO:0000001  mitochondrion inheritance
    GO:0000002  mitochondrial genome maintenance
    GO:0000003  reproduction
    GO:0000005  ribosomal chaperone activity
    GO:0000006  high affinity zinc uptake transmembrane transporter activity
    GO:0000007  low-affinity zinc ion transmembrane transporter activity
    GO:0000008  thioredoxin
    GO:0000009  alpha-1,6-mannosyltransferase activity
    GO:0000010  trans-hexaprenyltranstransferase activity
    GO:0000011  vacuole inheritance

    $ head go.rel
    GO:0000001  is_a    GO:0048308
    GO:0000001  is_a    GO:0048311
    GO:0000002  is_a    GO:0007005
    GO:0019952  alt_id  GO:0000003
    GO:0050876  alt_id  GO:0000003
    GO:0000003  is_a    GO:0008150
    GO:0000006  is_a    GO:0005385
    GO:0000007  is_a    GO:0005385
    GO:0000013  alt_id  GO:0000008
    GO:0000009  is_a    GO:0000030
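A relationship file of this shape can be turned into a parent lookup quite easily. The Python 3 sketch below is an assumption-laden illustration, not Hoea's code: the names read_relations and ancestors are hypothetical, alt_id lines are assumed to read "alternate id, alt_id, primary id", and every other relation is treated as a child-to-parent edge.

```python
from collections import defaultdict

def read_relations(lines):
    """Return (parents, primary_of) built from three-column relationship lines.

    Hypothetical helper; the column meanings are assumptions (see lead-in).
    """
    parents = defaultdict(set)   # term -> set of direct parent terms
    primary_of = {}              # alternate id -> primary id
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue
        left, relation, right = fields
        if relation == "alt_id":
            primary_of[left] = right
        else:
            parents[left].add(right)
    return parents, primary_of

def ancestors(term, parents, primary_of):
    """Collect all terms reachable by following parent edges upwards."""
    start = primary_of.get(term, term)   # resolve an alternate id first
    seen, stack = set(), [start]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# Lines copied from the head of go.rel shown above.
GO_REL = [
    "GO:0000001 is_a GO:0048308",
    "GO:0000001 is_a GO:0048311",
    "GO:0019952 alt_id GO:0000003",
    "GO:0000003 is_a GO:0008150",
]

parents, primary_of = read_relations(GO_REL)
print(sorted(ancestors("GO:0000001", parents, primary_of)))
print(sorted(ancestors("GO:0019952", parents, primary_of)))
```

With the full go.rel file the same traversal would climb several levels, which is what allows a gene annotated to a specific term to be counted for its more general ancestors as well.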
    $ head ko.def
    4     Cellular Processes
    5     Human Diseases
    6     Genetic Information Processing
    3     Environmental Information Processing
    2     Organismal Systems
    1     Metabolism
    1.14  Nucleotide Metabolism
    1.6   Biosynthesis of Other Secondary Metabolites
    2.10  Excretory System
    6.19  Transcription

    $ head ko.rel
    1.4   is_a  1
    1.5   is_a  1
    1.6   is_a  1
    1.7   is_a  1
    1.1   is_a  1
    1.2   is_a  1
    1.3   is_a  1
    3.11  is_a  3
    1.8   is_a  1
    4.13  is_a  4

With this, all preparation is finished and we can run the analysis:
The GO level:
    $ Hoea -d go.def -r go.rel -i com2up.id -a all.go
    read in ho relationship file... done
    read in ho definition file... done
    read in background ho annotations... done
    generate ho term to gene id mapping,this may take a little while... done
    start analysis...
    input 3932 items for analysis, 946 items with annotations
    20523 items have annotations in background
    All Done!

The KO level:
    $ Hoea -d ko.def -r ko.rel -i com2up.id -a all.ko
    read in ho relationship file... done
    read in ho definition file... done
    read in background ho annotations... done
    generate ho term to gene id mapping,this may take a little while... done
    start analysis...
    input 3932 items for analysis, 2503 items with annotations
    55195 items have annotations in background
    All Done!

Finally, see what the current directory contains now:
    $ ls
    all.go               com2up.id_GO_2.png  com2up.id_KO_2.png  com2up.id_KO_5.png     com2up.id_Routput.txt  ko.rel
    all.ko               com2up.id_GO_3.dot  com2up.id_KO_3.dot  com2up.id_KO_6.dot     gene_ontology_ext.obo
    com2up.id            com2up.id_GO_3.png  com2up.id_KO_3.png  com2up.id_KO_6.png     go.def
    com2up.id_GO_1.dot   com2up.id_KO_1.dot  com2up.id_KO_4.dot  com2up.id_Rcom.R       go.rel
    com2up.id_GO_1.png   com2up.id_KO_1.png  com2up.id_KO_4.png  com2up.id_Rcom.Rout    ko
    com2up.id_GO_2.dot   com2up.id_KO_2.dot  com2up.id_KO_5.dot  com2up.id_Rinput.txt   ko.def

That is really everything; the analysis is finished :-)