KAIKOGAAS is an integrated database and automated annotation system designed specifically for analysis of the genome of the silkworm (Bombyx mori), known in Japanese as "KAIKO". The database currently contains the annotation of scaffolds derived from the whole genome sequencing (WGS) efforts of the Silkworm Genome Research Program (SGP) in Japan and the Silkworm Genome Project in China. The WGS reads obtained independently by the two groups were merged, generating a genome assembly with about 91% genome coverage. In addition, the database includes the annotation of BAC clones that were used to check the quality of the WGS assembly.

All annotation data were generated by an automated annotation tool developed exclusively for functional and structural analysis of the silkworm genome. This annotation system includes coding region prediction programs (GENSCAN, FGENESH, MZEF), a splice site prediction program (SplicePredictor), DNA sequence homology search programs (BLAST, HMMER, ProfileScan, MOTIF), a tRNA gene prediction program (tRNAscan-SE), repetitive DNA analysis programs (RepeatMasker, Printrepeats), a protein localization site prediction program (PSORT), and a membrane protein classification and secondary structure prediction program (SOSUI). Furthermore, GFSelectorK automatically assigns a unique function to each predicted gene based on the protein homology of the gene.
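
The idea of assigning one function per predicted gene from protein homology can be illustrated with a minimal sketch. This is not the GFSelectorK algorithm, which is not described here; the sketch simply assumes that each gene has a list of homology hits and that the hit with the lowest E-value below a cutoff supplies the function. The `Hit` record, the `assign_functions` helper, the cutoff of 1e-5, and the example hits are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Hit:
    """One homology search hit for a predicted gene (hypothetical record)."""
    gene_id: str
    description: str
    evalue: float


def assign_functions(hits: Iterable[Hit], max_evalue: float = 1e-5) -> Dict[str, str]:
    """Return one function description per gene, taken from its best hit.

    Assumption: "best" means the lowest E-value at or below max_evalue.
    Genes without any hit passing the cutoff are left unassigned.
    """
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.evalue > max_evalue:
            continue  # too weak to support a function assignment
        current = best.get(hit.gene_id)
        if current is None or hit.evalue < current.evalue:
            best[hit.gene_id] = hit
    return {gene_id: hit.description for gene_id, hit in best.items()}


# Made-up example hits, for illustration only.
example_hits = [
    Hit("gene1", "fibroin heavy chain", 1e-50),
    Hit("gene1", "hypothetical protein", 1e-10),
    Hit("gene2", "weak match", 0.5),
]
print(assign_functions(example_hits))
```

In this sketch, gene1 receives the description of its strongest hit, and gene2 receives no function because its only hit fails the cutoff.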

Annotation of Silkworm Genome Scaffolds
A total of 192 scaffolds are anchored to the 28 silkworm chromosomes (total length: 417.7 Mb). The annotated scaffolds for each chromosome can be browsed from the chromosome link below. Additionally, the chr_Un link includes the annotation of 4,615 scaffolds that are more than 1 kb in length but whose position in the chromosomes is unknown.

The scaffold segment ID provides a link to the annotation map. The position of the scaffold in the genome can be accessed via the silkworm genome database, KAIKObase.
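
The selection rule for the chr_Un set (scaffolds longer than 1 kb that are not anchored to a chromosome) can be sketched as follows. This is only an illustration, assuming the scaffolds are available as FASTA-formatted text and that the names of the anchored scaffolds are known; the function names, the scaffold names, and the sequences are hypothetical and are not taken from KAIKOGAAS.

```python
from typing import Dict, Iterable, List, Set


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each sequence name in FASTA-formatted lines to its length in bases."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            # Use the first word of the header as the name.
            name = fields[0] if fields else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def unplaced_scaffolds(lengths: Dict[str, int], anchored: Set[str],
                       min_length: int = 1000) -> List[str]:
    """Names of scaffolds longer than min_length that are not anchored."""
    return sorted(
        name for name, length in lengths.items()
        if length > min_length and name not in anchored
    )


# Made-up scaffolds: scafA is anchored, scafB is too short, scafC qualifies.
fasta_lines = [
    ">scafA", "ACGT" * 300,
    ">scafB", "ACGT" * 100,
    ">scafC", "A" * 1500,
]
lengths = read_fasta_lengths(fasta_lines)
print(unplaced_scaffolds(lengths, anchored={"scafA"}))
```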

Annotation of Silkworm BAC clones
The annotated BAC sequences comprise 55 BAC clones derived from genomic libraries of the P50T strain (mixed sex) and the C108 strain (mixed sex). The genes predicted by the system are shown in the "AutoPredgeneset" row on the annotation map. Manually curated annotations are provided for two BAC clones, namely 12L3 and 4L14.

