Transcriptions are in progress, requests open!
Index
- The Pot Cast
- Youngsang Cho 2023 JADAM lectures
- Breeder’s Syndicate
- Shaping Fire
- Future Cannabis Project
- Calyx & Crew Podcast
- Restore Hemp Archive
- Cornell SIPS
- Apogee Instruments
- Grow From Your Heart Podcast
Pending
- Requests Open!
Last updated 2023-06-12
Original Post:
So this is an idea that’s been floating around my head for a while. There’s tons of breeding info out there in video or audio form, and for the vast majority of it the authors don’t provide transcriptions. This is frustrating for several reasons: I don’t always want to watch or listen to learn, there’s a lot of non-breeding-related chatter, and I can’t easily reference or share the information without scrubbing through it all over again.
Until the recent explosion of accessible deep learning AI, the options for transcribing this content were not great and usually paywalled. I’ll leave the nerdy stuff for the end of the post, but I came across a pretty decent method of transcription and I’d love to share it with my fellow OGers.
So here’s the first transcript I’ve done of The Pot Cast’s Ep 50 with Ryan Lee/Chimera (pt. 1)
I think the results are good enough. Obviously it’s not perfect, but it’s far better than nothing. I’d love to know what y’all think about this. Do you see yourself using this? Which podcasts or videos would you want transcribed?
Nerdy stuff
Quick preface: I am by no means an expert on anything related to this. I just find it fascinating and have loved tinkering with anything that leverages GPU compute. I’m a total casual when it comes to Python (or any programming), deep learning AI, and everything in between. So, as the pastebin says, this was made using Whisper (specifically, whisper-standalone-win), and I was able to transcribe the two-hour podcast in 10 minutes on an RTX 3080 using the large model - not bad at all. Plus, the workflow with the standalone app is dead simple: grab the audio/video, run the command, wait, get the output.
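For anyone curious what “get the output” looks like: Whisper emits the transcript as a list of segments, each with a start time, end time, and text, which is easy to post-process into a readable timestamped transcript. A minimal sketch in plain Python - the segment data below is made up for illustration, not from an actual run:

```python
def format_transcript(segments):
    """Turn Whisper-style segments (start, end, text) into timestamped lines."""
    def hms(seconds):
        # Convert float seconds to HH:MM:SS
        h, rem = divmod(int(seconds), 3600)
        m, s = divmod(rem, 60)
        return f"{h:02d}:{m:02d}:{s:02d}"

    return "\n".join(
        f"[{hms(seg['start'])} - {hms(seg['end'])}] {seg['text'].strip()}"
        for seg in segments
    )

# Hypothetical segments, shaped like Whisper's JSON output
segments = [
    {"start": 0.0, "end": 4.2, "text": " Welcome back to the show."},
    {"start": 4.2, "end": 9.8, "text": " Today we're talking about breeding."},
]
print(format_transcript(segments))
```

The same loop could just as easily emit SRT or Markdown, which matters for the repository idea below.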
I think there would be some value in creating a public repository of the transcriptions, similar to how open-source projects handle translations. That way the info is easily indexed, and everyone can contribute corrections or additional transcriptions. It could even all be hosted straight on GitHub, with a GitHub Pages frontend for browsing. If there’s enough demand, I’d be more than glad to set that up to host all the requests you OGers want to see.
Additionally, I came across this project, which combines Whisper and pyannote-audio to split the transcription by speaker. It then generates an HTML site where the text follows along with the video, matched up to the timestamps - very cool! I’m new to Jupyter notebooks so I struggled a bit, but I managed to get it working as a regular Python script on Windows. The workflow is super hacky and manual, and the speaker identification is so-so. Something like this would be the endgame for the GitHub repo idea I mentioned above.
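The matching step in that kind of pipeline boils down to assigning each Whisper segment to whichever pyannote diarization turn overlaps it the most in time. A rough sketch of that idea - the data structures and speaker labels here are my own stand-ins, not the actual project’s code:

```python
def assign_speakers(segments, turns):
    """Label each transcript segment with the speaker whose
    diarization turn overlaps it the most (measured in seconds)."""
    labeled = []
    for seg in segments:
        best, best_overlap = "UNKNOWN", 0.0
        for turn in turns:
            # Overlap between [seg.start, seg.end] and [turn.start, turn.end]
            overlap = min(seg["end"], turn["end"]) - max(seg["start"], turn["start"])
            if overlap > best_overlap:
                best, best_overlap = turn["speaker"], overlap
        labeled.append({**seg, "speaker": best})
    return labeled

# Hypothetical transcript segments and diarization turns
segments = [
    {"start": 0.0, "end": 5.0, "text": "Hey everyone, welcome back."},
    {"start": 5.0, "end": 12.0, "text": "Thanks for having me on."},
]
turns = [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 4.8},
    {"speaker": "SPEAKER_01", "start": 4.8, "end": 12.5},
]
for seg in assign_speakers(segments, turns):
    print(f"{seg['speaker']}: {seg['text']}")
```

Because diarization and transcription boundaries rarely line up exactly, this greedy overlap match is also where the “so-so” speaker identification tends to show.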