Build A Personal BioDatabase with Mysql
Build A Personal BioDatabase with Mysql
This is my first time to try build a Database on my own.
I’m recording my experience and the structure of this database here in order to prevent me from forgetting how to retrieve or update the data in it.
Before doing this, I spend a few hours to taught myself about Mysql and recorded all my notes at blog
Please feel free to check the Mysql grammar before running codes below if you are an newbies like me.
I’ll start to do this with few questions:
- Why I’d like to build a database
- What I database I’d like to have
- Which features does this DB will have
Why and What
Generally, I am writing a paper about the diet-switch experiment which has a few RNA-seq data-set.
It could give me a great help if there is a database which targeted to the diet-gene interaction.
But, I can’t find a one.
Though, I know that gene regulated by nutrition effect could mostly contributed to of the effects like immune system depress, stress response, etc. But it do help researches focusing the interactions they may interested rather than searching blindly among more than 20 thousands of genes from RNA-seq or other high-throughput results.
So, I decided to write a database which target to the how the genes responded to nutrition effects.
Due to the limited of the experience and capability, I’ll store the well verified and reliable data, WB and qPCR for instance, only and maybe I’ll considerate the NGS data in the feature. It could be very fun to deal with large quantity of data, you know, as a user, definitely not a manager.
Which
At present, I think there are few separate tables I’d like to have
- Species (KEGG Organisms: has (I just like this elegant abstract))
- Species Taxonomy (A complex-version like NCBI taxonomy or sample version like KEGG Organisms?)
- Gene Annotation (I’ll try, at least.)
- Other IDs
- GO entry
- Kegg ID
- Kegg map
- Food entities (not too much experience about that. But the name will basically based on the paper)
- Regulation Net
Gene Annotation table
This is one of the most important table since the clarity of entities is pretty important for gene-related database since a common gene could have to many names.
Gene Annotation
At present, I’d like the gene has the entities like below: (mostly contributed by uniprot database)
- UniProtID: P04637
- UniprotName: P53_HUMAN
- Gene name: P53
- Organism: Homo sapiens (Human)
- Function: Acts as a tumor suppressor in many tumor type…
- GO-Molecular-function: ATP binding;chaperone binding…
- GO-Biological-process: autophagy;B cell lineage commitment
- KEGG: hsa7157 (https://www.genome.jp/dbget-bin/www_bget?hsa:7157)
- KEGG maps: hsa01522;hsa01524;hsa04010…
Diet protocol
- Diet id
- Diet name
- Diet Protocol
- Citation
Regulation
- Diet id
- Diet_secular (Soy protein)
- Diet_specific ()
- Diet_SN ()
- Specious
- Specious_muta
- Genes
- Ratio (up/down)
- Experiment (WB/qPCR/NGS)
Codes
Create DB
|
INPUT and OUTPUT
Input
|
output
|
Build A Personal BioDatabase with Mysql
https://karobben.github.io/2021/01/31/Bioinfor/food_gene_db/