The Grand Ol' Database by MrPanda - open to OG

This must be pre-cookies influx.

2 Likes

It would be fun to also build a table of parents for each strain. Then one could potentially find parents that tend to have certain characteristics in the offspring.
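Here is a minimal sketch of what that parent lookup could look like. It assumes a hypothetical lineage table (offspring mapped to two parents) and one measured trait per offspring. All strain names and numbers are placeholders, not rows from the database.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical lineage table: offspring -> (parent, parent). Placeholder names.
PARENTS = {
    "Cross A": ("Parent X", "Parent Y"),
    "Cross B": ("Parent X", "Parent Z"),
    "Cross C": ("Parent Y", "Parent Z"),
}

# Hypothetical trait value per offspring (for example limonene, in %).
TRAIT = {"Cross A": 0.6, "Cross B": 0.8, "Cross C": 0.2}


def mean_offspring_trait(parents, trait):
    """Average the trait over every offspring each parent contributed to."""
    by_parent = defaultdict(list)
    for child, pair in parents.items():
        if child not in trait:
            continue  # no test result for this offspring
        for parent in pair:
            by_parent[parent].append(trait[child])
    return {p: mean(vals) for p, vals in by_parent.items()}


averages = mean_offspring_trait(PARENTS, TRAIT)
# Parents sorted from highest to lowest average offspring trait.
ranked = sorted(averages.items(), key=lambda kv: kv[1], reverse=True)
```

With only a few offspring per parent the averages will be noisy, so it would probably make sense to also track how many offspring each average is based on.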

4 Likes

What program are you using to run that?
Keep it coming!
Ronzo

1 Like

That is a tool for accessing SQL Server data, called SSMS. I will be building a couple basic web pages where everyone can run some basic queries like that one.

8 Likes

Thank you, I appreciate it.
Ronzo

Yeah, I started collecting the data when I was dead set on starting a big genetics company, but you know, I dream bigger than my pocketbook, so after 5-6 years with this list, I'm sharing it.

My initial goal was to breed for things like super high THCV, find THCa strains that fell under delta-9 regs here in NY, and build the most complete set of terps in a staple cultivar.

Now I realize that requires many growers; give them a tool and we can breed more intentionally. Yes, the data is from older databases, pre-cookies and pre-inflated test results, and should be relatively honest.

9 Likes

The data started out very nasty, with concentrate tests ruining the flower test calculations and other weird stuff. That took a lot to resolve, and I removed around 8k results in the process. When looking at it, you may see pheno #'s and selections, which is cool.

I tried removing tests for trim, prerolls, hash, wax, crumble, etc.
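A rough sketch of that kind of cleanup, assuming the product type shows up somewhere in the sample name. The keyword list comes from the post above; the sample names and the `is_flower` helper are made up for illustration, and this is not necessarily how the real cleanup was done.

```python
# Keywords taken from the post above; "pre-roll" is an assumed spelling variant.
NON_FLOWER_KEYWORDS = ("trim", "preroll", "pre-roll", "hash", "wax", "crumble")


def is_flower(sample_name):
    """Return False when the sample name mentions a non-flower product."""
    name = sample_name.lower()
    return not any(word in name for word in NON_FLOWER_KEYWORDS)


# Hypothetical sample names.
samples = ["Sample 1 flower", "Sample 2 Wax", "Sample 3 prerolls", "Sample 4 #7"]
flower_only = [s for s in samples if is_flower(s)]
```

Plain substring matching is blunt: a flower strain with "hash" in its name would be dropped too, so flagged rows are probably worth a manual look before deleting.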

There may still be some things left to improve, but I find it useful when putting queries to an AI trained on the data.

5 Likes

Looking at this screenshot, the THC calc may need to be shifted by a decimal place. Maybe the highest THC was actually 29% at that time, not 2.9%.

3 Likes

I will go ahead and shift the decimal over one on the THC scores; that does sound right.

Yeah, I have been referencing the delta-9 THC column as being more accurate for that. I did a query for any results over 40%, thinking anything above that would be leftover concentrate results.
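For anyone following along, the two steps discussed here (shifting the THC decimal and flagging anything over 40%) could be sketched like this. The column names `thc` and `delta9_thc` and all row values are assumptions, not the actual schema.

```python
CONCENTRATE_CUTOFF = 40.0  # percent, the cutoff mentioned above

# Hypothetical rows; "thc" is assumed to be stored one decimal place too low.
rows = [
    {"sample": "Sample 1", "thc": 2.9, "delta9_thc": 1.1},
    {"sample": "Sample 2", "thc": 1.8, "delta9_thc": 0.4},
    {"sample": "Sample 3", "thc": 7.2, "delta9_thc": 68.0},
]


def shift_thc(row):
    """Return a copy of the row with the THC value moved one decimal place."""
    fixed = dict(row)
    fixed["thc"] = round(row["thc"] * 10, 2)
    return fixed


corrected = [shift_thc(r) for r in rows]

# Anything above the cutoff is treated as a possible leftover concentrate result.
suspect = [r["sample"] for r in corrected if r["delta9_thc"] > CONCENTRATE_CUTOFF]
```

The shift should only be applied once, so it is safer to write the corrected values to a new column or table than to update in place.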

So, my AI queries are something like this:

Give me the top 10 samples that are high in delta-9 THC, Ocimene, and Limonene…
-or-
What 20 samples are high in delta-9 THC and have the largest number of total compounds…
-or-
What 2 samples, when combined, give the largest spread of unique compounds…
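For those who would rather see the logic than trust the AI, the three questions above could be sketched roughly like this. It assumes each sample is a mapping from compound name to a measured value; the sample names, values, and function names are all made up for illustration.

```python
from itertools import combinations

# Hypothetical samples: compound name -> measured percentage (made-up values).
SAMPLES = {
    "Sample 1": {"delta-9 THC": 1.2, "Ocimene": 0.30, "Limonene": 0.50},
    "Sample 2": {"delta-9 THC": 0.9, "Limonene": 0.20, "Myrcene": 0.70, "Linalool": 0.10},
    "Sample 3": {"delta-9 THC": 1.5, "Ocimene": 0.10, "Pinene": 0.40},
}


def top_by_all(samples, compounds, n):
    """Rank samples by their weakest value among the requested compounds,
    so a sample has to be high in every one of them to rank well."""
    def score(name):
        return min(samples[name].get(c, 0.0) for c in compounds)
    return sorted(samples, key=score, reverse=True)[:n]


def most_compounds(samples, n):
    """Samples with the largest number of non-zero compounds."""
    def count(name):
        return sum(1 for v in samples[name].values() if v > 0)
    return sorted(samples, key=count, reverse=True)[:n]


def widest_pair(samples):
    """Pair of samples whose combined set of listed compounds is largest."""
    def spread(pair):
        a, b = pair
        return len(set(samples[a]) | set(samples[b]))
    return max(combinations(samples, 2), key=spread)
```

Ranking by the weakest value is just one simple way to read "high in all of these"; since cannabinoids and terpenes sit on very different scales, a real version would probably rank each compound separately (for example by percentile) before combining.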

I ended up downloading Zing Data. It has a mobile app and can load the dataset via a G-sheet link or CSV; both work. After installing the app, even on the free account, you can run all the queries using AI natural language (no SQL knowledge needed).

4 Likes

Here we go!

I added the DB to an AI Assistant and embedded it in a webpage so the community can test it out.

https://breedz.co/

The goal is to stage improvements of data, add recent data, and make the tool more robust.

Thanks everyone for the support, let's grow this and make a useful tool.

7 Likes

Thank you for setting this up.

Interesting that Hawaiian Snow has 9.67% CBGA

I was not expecting that.

4 Likes

Awesome, this works well. And it is quick as well.

of the outdoor strains you said will grow outdoors in southern Alberta, which has the shortest grow time and highest THC?
Tangie B#555
38.87

2 Likes

That's cool…

I asked it for the "top ten most common terpenes"
and it chewed on it for a bit and barfed out the numbers… all neatly organized, and that. was. it… ok…
"add the names" and now… a perfect reply

I need to beat on this some more… :kissing_heart:

Cheers
G

4 Likes

I got the same result using your quoted stuff, it's pretty cool.
Is there a way for this interface to post relevant links and maybe pictures?
Ronzo

3 Likes

best plant says - urban poison, highest terpene content = durban poison x nl

3 Likes

We can append any data, such as relevant links, etc.

I would like to have it direct to an AI search for seed banks that carry those seeds (not just sponsors or listed vendors, but anyone anywhere, small or large). Or possibly "mentions": since many heirlooms don't exist in seed banks, it could reference an Instagram account that tagged it, or similar. Bridging the gap on sourcing your medicine.

4 Likes

I am suspicious of that result. There are possibly still some rows that are actually concentrates, and that may be one of them. It's hard to understand what each column represents, but the delta-9 THC-A column in that row is 68.34, which would mean total THC of around 65%. Since @MrPanda filtered out concentrate rows based on delta-9 THC only, this row would not be excluded. It may be more effective to keep only rows with total cannabinoids < 40%.
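A quick sketch of that check, using the commonly used decarboxylation factor of 0.877 (total THC = delta-9 THC + 0.877 × THC-A). Only the 68.34 figure comes from the thread; the delta-9 value, the column names, and the helper names are placeholders, and the sum assumes every column is in percent.

```python
THCA_TO_THC = 0.877  # mass ratio of THC to THC-A after decarboxylation


def total_thc(delta9_thc, thca):
    """Estimated total THC in percent from delta-9 THC and THC-A."""
    return delta9_thc + THCA_TO_THC * thca


def looks_like_flower(cannabinoids, cutoff=40.0):
    """True when the summed cannabinoid columns stay under the cutoff (percent)."""
    return sum(cannabinoids.values()) < cutoff


# Only the 68.34 THC-A value is from the thread; 0.5 is a placeholder.
row = {"delta-9 THC": 0.5, "delta-9 THC-A": 68.34}

thca_only = total_thc(0.0, row["delta-9 THC-A"])  # THC-A contribution alone
keep = looks_like_flower(row)
```

With the 0.877 factor, the THC-A value alone works out to roughly 60% total THC, so a row like this lands well above a 40% cutoff either way.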

1 Like

Or perhaps those specific cannabinoid columns are in mg/g instead of %, in which case it would mean 0.967% CBG, which seems reasonable.
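The unit arithmetic behind that guess is simple: 1 g is 1000 mg, so 10 mg/g equals 1% by weight. A tiny sketch (the function name is just for illustration):

```python
def mg_per_g_to_percent(value_mg_per_g):
    """Convert mg/g to percent by weight (10 mg/g == 1%)."""
    return value_mg_per_g / 10.0


# 9.67 is the value quoted in the thread, read here as mg/g instead of %.
as_percent = mg_per_g_to_percent(9.67)
```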

1 Like

I have a plant that has up to 18% CBGA, but it was made in a university lab.