Hey @Chris Krecicki, I'm stuck on calls right now, but I'm curious about your thoughts on what it would take to run transcription on our entire call recording repository. It seems like you might already have the keys to make that run.
If we can run and store all of it, we would feed it to SimpleTalk and build out our call IVR solution to see what might be possible in terms of screening intakes and gathering info before the agent interaction.
I'll have some more time after 12:30 to chat if that works.
oh man, that would take A LONG time and would cost a fortune lol. if you want I can get numbers and figures on it
as for SimpleTalk, i'd have to look into it
i imagine we'd upload the transcribed files that are returned in json to postgres (what we use for a database)
yeah, James and I are thinking the same thing in terms of the transcription effort. Joe has us locked into SimpleTalk to have them take the info and tailor our approach from there with the call center, but getting the calls transcribed to feed the model is a beast.
i could do it easily but the time and cost would be nuts
If it isn't a huge lift to get an idea of what the cost of transcription would be, that would be awesome
yeah let me get an idea, give me a few. i'll strip some scripts apart and make something real quick to evaluate cost
high level cost is a good starting point, then if we need to get into specific campaigns and cut back on the segments of recordings to transcribe, it would make more sense I guess
yeah i'd have to see if it's possible to break it down like that, because it's all just stored in google cloud buckets in a flat way (no directories) in this format: BeBe Dailey-2025-06-30-22-37-41 3463971724_transcription.json
that's how the audio files are stored currently
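Since the bucket is flat, any breakdown would have to come from the object names themselves. A minimal sketch of pulling the pieces out of a name shaped like the example above. The field meanings are assumptions, not something confirmed in this thread: the leading text is treated as a name, the middle as a timestamp, and the digits as an identifier. `parse_object_name` is a hypothetical helper.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the single example in the chat:
#   "<name>-<YYYY-MM-DD-HH-MM-SS> <digits><suffix>"
OBJECT_NAME = re.compile(
    r"^(?P<name>.+?)-(?P<stamp>\d{4}(?:-\d{2}){5}) (?P<digits>\d+)(?P<suffix>.*)$"
)


def parse_object_name(object_name):
    """Split a flat object name into its parts, or return None if it does not fit."""
    match = OBJECT_NAME.match(object_name)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "recorded_at": datetime.strptime(match.group("stamp"), "%Y-%m-%d-%H-%M-%S"),
        "digits": match.group("digits"),
        "suffix": match.group("suffix"),
    }


info = parse_object_name("BeBe Dailey-2025-06-30-22-37-41 3463971724_transcription.json")
print(info["name"], info["recorded_at"].date(), info["digits"])
```

With the parts separated, files could be grouped by name or by date range before deciding what to transcribe. Whether anything in the name maps to a campaign is not known from this thread.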
getting total files and total time of the files
theyre the best around, we use their best model
they return it all in structured json too
shouldnt actually be too much -- it'll just take a long time
im at 72,000 files counted so far and it shows 359 hours
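A rough sketch of how the running count could be turned into a high-level estimate. Only the 72,000 files and 359 hours come from the count above. The final file count and the per-hour price are hypothetical placeholders, since neither is known in this thread, and the function names are made up for illustration.

```python
def average_seconds_per_file(hours_so_far, files_so_far):
    """Average recording length, in seconds, over the files counted so far."""
    return hours_so_far * 3600 / files_so_far


def project_total_hours(hours_so_far, files_so_far, total_files):
    """Scale the hours counted so far up to the full bucket.

    Assumes the uncounted files have the same average length as the counted ones.
    """
    return hours_so_far * total_files / files_so_far


def estimate_cost(total_hours, rate_per_hour):
    """Flat-rate cost: audio hours times the provider's price per audio hour."""
    return total_hours * rate_per_hour


# Figures from the running count in this thread.
avg = average_seconds_per_file(359, 72_000)

# Placeholders only: swap in the real file count and the provider's real rate.
hypothetical_total_files = 144_000
hypothetical_rate_per_hour = 0.50

hours = project_total_hours(359, 72_000, hypothetical_total_files)
print(f"avg {avg:.2f} s/file, {hours:.0f} h projected, "
      f"cost {estimate_cost(hours, hypothetical_rate_per_hour):.2f}")
```

The same `estimate_cost` call works on a subset of hours if the scope gets cut down to specific campaigns later.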
may take 3-5 days or less, depending on the speed on their end
i'll update you in a few when this script finishes counting
the bucket we're counting from is sl-five9-recordings