Below is a list of complementary and largely independent project goals.
My intuition is that we should tackle these sequentially as far as possible, since they'll be additive when completed in succession. However, if there's a particular goal or area of the project you want to start on right away, please feel free to let the rest of us know!
Main Course
Create a better chatbot than Canna-GPT by using a superior fine-tuning dataset & model configuration.
Ingest relevant computational biology tooling codebases (e.g. BioPython) to enable a chatbot capable of contextual question answering and code generation in GitHub repositories.
Create chains/agents capable of accessing relevant APIs, similar to Gene-GPT, and then extracting, processing, and interpreting the data.
Create chains/agents capable of using pre-trained models for computational biology, and provide them with access to cannabis data.
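To make the codebase-ingestion goal above a bit more concrete, here is a minimal sketch of one way to start: split Python source into one chunk per top-level function or class, so each chunk can later be embedded and retrieved as context for the chatbot. Everything here is an assumption for illustration only. The names chunk_python_source and keyword_search are hypothetical, the sample source is made up rather than taken from BioPython, and the keyword ranking is just a stand-in for whatever embedding-based retrieval we end up choosing.

```python
import ast


def chunk_python_source(source: str, path: str = "<unknown>") -> list[dict]:
    """Split Python source into one chunk per top-level function or class.

    Hypothetical helper: a real pipeline would likely also handle nested
    definitions, very long bodies, and non-Python files.
    """
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(
                {
                    "path": path,
                    "name": node.name,
                    "kind": type(node).__name__,
                    "docstring": ast.get_docstring(node) or "",
                    # Exact source text of this definition (decorators excluded).
                    "code": ast.get_source_segment(source, node),
                }
            )
    return chunks


def keyword_search(chunks: list[dict], query: str) -> list[dict]:
    """Rank chunks by how often the query words appear in them.

    Assumption: this naive count is only a placeholder for embedding search.
    """
    words = query.lower().split()
    scored = []
    for chunk in chunks:
        text = f'{chunk["name"]} {chunk["docstring"]} {chunk["code"]}'.lower()
        score = sum(text.count(word) for word in words)
        if score > 0:
            scored.append((score, chunk))
    # Sort on the score only, highest first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored]


# Made-up sample source standing in for a real repository file.
SAMPLE = '''
def gc_content(seq):
    """Return the fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq.upper()) / len(seq)


class Record:
    """Hold an identifier and a sequence."""

    def __init__(self, identifier, seq):
        self.identifier = identifier
        self.seq = seq
'''

if __name__ == "__main__":
    sample_chunks = chunk_python_source(SAMPLE, path="sample.py")
    for hit in keyword_search(sample_chunks, "fraction of bases"):
        print(hit["kind"], hit["name"])
```

Chunking at function and class boundaries keeps each retrieved piece self-contained, which tends to matter more for code question answering than fixed-size text windows do. Whether that holds for the repositories we care about is something to check once we have real data.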
Dessert
Benchmark the capabilities and limitations of alternative systems (Canna-GPT, GPT-4, Bio-GPT) against our desired goal(s). For example, Canna-GPT doesn't have memory.
Prepare some really awesome fine-tuning data.
Combine all the cool things and all the cool data.