Using a Grakn Knowledge Graph for Biological Sequence Alignment Analysis
The Code
The code you saw above is Graql, the query language for Grakn, the knowledge graph. Graql's expressivity makes it one of the more readable query languages available. In simple terms, Graql is a language that can be understood and written by anyone, not just programmers.
In this optimised workflow, Graql is used for analysis. The part of the code that automates the workflow consists of two files:
migrate.py: reads a .fasta file containing 12 proteins that relate to asthma (exported from UniProt) and inserts each protein along with its sequence into the knowledge graph.
blast.py: 1) extracts the target sequences from the knowledge graph, 2) runs a BLAST search for each sequence, and 3) imports the result from each BLAST search back into the knowledge graph.
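To give a feel for the first step, here is a minimal sketch of how a script like migrate.py might turn FASTA records into Graql insert statements. It is not the actual migrate.py. The schema names (protein, uniprot-id, sequence), the helper function names and the sample records are all assumptions made for illustration. The sketch only builds the query strings and does not talk to a Grakn server.

```python
from typing import Iterator, List, Tuple


def parse_fasta(lines: List[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may span several lines; collect them all.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def uniprot_accession(header: str) -> str:
    """Pull the accession out of a UniProt-style header 'db|ACCESSION|ENTRY_NAME ...'."""
    parts = header.split("|")
    if len(parts) >= 3:
        return parts[1]
    # Fall back to the first whitespace-separated token for other header styles.
    fields = header.split()
    return fields[0] if fields else ""


def insert_query(accession: str, sequence: str) -> str:
    """Build a Graql insert statement for one protein.

    'protein', 'uniprot-id' and 'sequence' are hypothetical schema names.
    """
    return (
        f'insert $p isa protein, has uniprot-id "{accession}", '
        f'has sequence "{sequence}";'
    )


# Made-up sample records with placeholder accessions, not real UniProt entries.
sample = """>sp|TEST01|EXAMPLE_HUMAN Hypothetical example protein
MKTAYIAKQR
QISFVKSHFS
>sp|TEST02|OTHER_HUMAN Another hypothetical protein
MADEEKLPPG
""".splitlines()

queries = [insert_query(uniprot_accession(h), s) for h, s in parse_fasta(sample)]
for query in queries:
    print(query)
```

In a real script, each generated statement would be sent to the knowledge graph within a write transaction, and the attribute names would need to match the schema defined earlier in this article.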
To run this example on your local machine, follow these instructions.
Tip of the Iceberg
This example only scratches the surface of how a knowledge graph can revolutionise workflows that rely on bioinformatic analytical tools, such as BLAST. The schema presented in this article can be extended and others can be modelled to represent complex
Grakn’s knowledge graph has the potential to complement many domains of biological research. These include analysing DNA sequence alignments, exploring genomics, drug discovery, disease networks, neuro-informatics, and investigating protein structure, function, properties and classifications.
We Want to Hear From You :)
Share your unique experience in using BLAST and how you think this optimised workflow relates (or not) to the way you use BLAST. As a part of this or other workflows, what other publicly available analytical tools and programs do you normally use? How do you see a central knowledge base, in the form of a knowledge graph, adding value in your field?
Talk to us and discuss your ideas with the Grakn community.
If you have any questions or comments about this work, please send me an email at [email protected] or tweet me @SaffariSoroush.