Updated 2 days ago

OpenAI audio transcription not working

Typebot's OpenAI transcription service does not seem to be working. I tried it and the model always hallucinates. So I downloaded the audio file from S3 and tried it in Postman, and it shows the same result: "MBC 뉴스 이덕영입니다." (Korean for "This is Lee Deok-young, MBC News.")
Attachment
image.png
24 comments
@Baptiste I know you're super busy with queries, but I'd really appreciate any help you can give with this
Well it's not from Typebot's side then
but maybe OpenAI is not performant enough?
Since you tried without Typebot
I tried with Typebot and it showed me the same output as well. So I downloaded the audio file from S3 and then used Postman to invoke the OpenAI transcription API as a test
and it works when I convert the .mp4 to .mp3 and then send it to OpenAI
I don't understand what you are showing in the screenshot then?
I can't read the response body.
I downloaded the audio mp4 file from S3 and then used Postman to invoke the Whisper API. This is the request and response
Attachment
image.png
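For anyone who wants to reproduce the Postman test above without Postman, here is a minimal sketch using only the Python standard library. It posts a file to OpenAI's `/v1/audio/transcriptions` endpoint as `multipart/form-data` with the `file` and `model` fields. The helper names `build_multipart` and `transcribe` are made up for illustration, and `whisper-1` is assumed as the model since the thread only says "Whisper API".

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

OPENAI_TRANSCRIPTION_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_multipart(fields, filename, file_bytes, mime_type):
    """Encode text fields plus one file part (named "file") as multipart/form-data.

    Hypothetical helper, not part of any library. Returns (body, content_type).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            f"Content-Type: {mime_type}\r\n\r\n"
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_key, model="whisper-1"):
    """POST the audio file to the transcription endpoint and return the text."""
    audio = Path(path)
    # Guess the MIME type from the extension; fall back to a generic type.
    mime = mimetypes.guess_type(audio.name)[0] or "application/octet-stream"
    body, content_type = build_multipart(
        {"model": model}, audio.name, audio.read_bytes(), mime
    )
    request = urllib.request.Request(
        OPENAI_TRANSCRIPTION_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # The default JSON response carries the transcript in the "text" field.
        return json.load(response)["text"]
```

Calling `transcribe("recording.mp4", api_key)` and then the same with an .mp3 copy of the file should show whether the container format changes the result (`recording.mp4` is a placeholder file name).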
then I converted the same file from mp4 to mp3 locally and tried it again
Attachment
image.png
and it worked
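The thread does not say which tool was used for the local mp4 to mp3 conversion. As one possible approach, here is a small sketch that shells out to ffmpeg, assuming ffmpeg is installed and on the PATH. The function names `build_ffmpeg_command` and `convert_to_mp3` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst):
    """Return the ffmpeg argument list that re-encodes src as an MP3 at dst."""
    return [
        "ffmpeg",
        "-y",                       # overwrite dst if it already exists
        "-i", str(src),             # input file (the .mp4 recording)
        "-vn",                      # drop any video stream
        "-codec:a", "libmp3lame",   # encode the audio as MP3
        "-q:a", "2",                # VBR quality level (lower is better)
        str(dst),
    ]


def convert_to_mp3(src):
    """Convert src to an .mp3 file next to it and return the new path."""
    src = Path(src)
    dst = src.with_suffix(".mp3")
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH (assumed to be installed)")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

The resulting .mp3 can then be sent to the transcription API in place of the original .mp4.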
I used the "Audio GPT" template from Typebot and it seems to have the same issue when transcribing audio
So you are saying that the transcription is working but is not accurate and if you convert the audio file to mp3 it is far more accurate? How can I try this? Do you have an example in English?
Yes, you can try this mp4, which I recorded and downloaded from the Typebot S3 bucket.
Then you are talking about an issue on OpenAI's end... The audio recording is clear to me
Yes, the audio is clear, but when you run it through OpenAI transcription, it is not transcribed correctly and the model hallucinates. Could it be the mp4 format? I have seen this mentioned in a couple of OpenAI threads
Like I said, then it is an issue from OpenAI’s side 🙏
Hey guys, is there a way for Typebot to understand audio from the sender and answer accordingly?
I know the bot can send audio, but when it receives audio, it says it doesn't understand
I'm using an OpenAI assistant with GPT-4o mini
Do I need the regular model, or is it a configuration in Typebot?
Hey guys, I faced the same issue where I received strange responses. The problem lies in the file format. OpenAI seems to have issues processing .mp4 files. If you use iOS devices, you’re out of luck because all messages are provided as .mp4. However, Android works fine because files are provided as .webm. I had to create an automation to convert the audio and feed it back into Typebot to solve this.
Of course, the file format itself is not a Typebot issue, but it’s a headache if you can’t just use audio recording and transcription within Typebot itself.
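An automation like the one described above needs to decide which uploads to convert. File extensions are not always reliable, so one option is to look at the first bytes of the file: MP4 files carry an `ftyp` box type at byte offset 4, and WebM (like Matroska in general) starts with the EBML magic number `1A 45 DF A3`. This is a minimal sketch under those assumptions; `sniff_container` and `needs_conversion` are hypothetical names, and the rule "convert mp4, pass webm through" simply mirrors what was observed in this thread.

```python
MP4_BOX_TYPE = b"ftyp"              # box type at bytes 4..8 of an MP4 file
EBML_MAGIC = b"\x1a\x45\xdf\xa3"    # first 4 bytes of a WebM/Matroska file


def sniff_container(data: bytes) -> str:
    """Guess the container format from the leading bytes of a file."""
    if len(data) >= 8 and data[4:8] == MP4_BOX_TYPE:
        return "mp4"
    if data[:4] == EBML_MAGIC:
        # EBML is shared by Matroska and WebM; labelled "webm" here because
        # that is what the browsers in this thread produce.
        return "webm"
    return "unknown"


def needs_conversion(data: bytes) -> bool:
    """Return True for recordings that should be converted before transcription."""
    return sniff_container(data) == "mp4"
```

Only the first few bytes are needed, so the check can run on the start of the downloaded recording before any conversion step.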
Unfortunately, I also noticed that there are some issues with audio recording in Typebot depending on the browser. Here’s what I have experienced so far (I could not test on Windows).

OSX:
  • ❌ Firefox: Recording does not start at button click
  • ✅ Chrome: Can send audio messages and audio file format "webm" is being accepted by OpenAI
  • ❌ Safari: Recording starts, but can’t send audio recording (keeps recording when clicking send button)
iOS:
  • ❌ Firefox: Can send audio messages but audio format "mp4" is not being accepted by OpenAI
  • ❌ Chrome: Can send audio messages but audio format "mp4" is not being accepted by OpenAI
  • ❌ Safari: Can send audio messages but audio format "mp4" is not being accepted by OpenAI
Android:
  • ❌ Firefox: Recording starts, but can’t send audio recording (keeps recording when clicking send button)
  • ✅ Chrome: Can send audio messages and audio file format "webm" is being accepted by OpenAI
Appreciate the testing! Let me create a GitHub issue and I will see what I can do