# README
GoAI Transcriber
GoAI Transcriber is a tool that uses OpenAI's Whisper model to transcribe audio files. It supports various audio formats including .m4a
and converts them to .mp3
before processing if necessary.
Features
- Transcribes audio files using OpenAI's Whisper model.
- Supports the following audio formats:
.mp3
,.mp4
,.mpeg
,.mpga
,.m4a
,.wav
,.webm
. - Automatically converts
.m4a
files to.mp3
for processing. - Provides a REST API for uploading and transcribing audio files.
- Includes Swagger documentation for easy API exploration.
Project Structure
.
├── cmd
│ ├── api
│ │ └── main.go
│ └── app
│ └── main.go
├── deployment
│ ├── Dockerfile
│ ├── Dockerfile.api
│ ├── docker-compose.yml
│ ├── docker-compose.api.yml
│ ├── terraform
│ │ ├── main.tf
│ │ └── variables.tf
├── internal
│ ├── api
│ │ └── api.go
│ ├── app
│ │ ├── app.go
│ │ └── functions.go
│ ├── controller
│ │ └── transcription.go
│ ├── entity
│ │ └── transcription.go
│ ├── repository
│ │ └── transcription.go
│ └── usecase
│ └── transcription.go
├── pkg
│ └── openai
│ └── api.go
├── docs
│ ├── docs.go
│ ├── swagger.json
│ └── swagger.yaml
├── go.mod
├── go.sum
├── LICENSE
├── Makefile
├── README.md
└── .env
Installation
-
Clone the repository:
git clone https://github.com/umarquez/goai_transcriber.git cd goai_transcriber
-
Set up environment variables: Create a
.env
file and add your OpenAI token:APP_OPENAI_TOKEN=your_openai_token APP_WORKING_PATH=./audios
Running the Application
Running the Application Locally
Standard Application
-
Build and run the application:
go build -o bin/app cmd/app/main.go ./bin/app
-
Transcribe audio files:
- The application reads the content of the
./audios
directory. - If there are
.m4a
files, they are converted to.mp3
due to an error from the OpenAI API processing.m4a
files. - The transcription result is written to the same directory with a
.txt
extension.
- The application reads the content of the
API Version
-
Generate Swagger documentation:
swag init --parseDependency --parseInternal -g cmd/api/main.go -o ./docs
-
Build and run the API:
go build -o bin/api cmd/api/main.go ./bin/api
-
Transcribe audio files via API: Use a tool like
curl
or Postman to send aPOST
request to the/transcribe
endpoint with your audio file.Example
curl
command:curl -X POST "http://localhost:8080/transcribe" -H "accept: application/json" -H "Content-Type: multipart/form-data" -F "file=@path/to/your/audiofile.m4a"
Running the Application Using Docker
Standard Application
-
Build and run the application using Docker:
docker-compose -f deployment/docker-compose.yml up --build
-
Transcribe audio files: Place your audio files in the
audios
directory and the application will automatically process and transcribe them.
API Version
-
Build and run the API using Docker:
docker-compose -f deployment/docker-compose.api.yml up --build
-
Transcribe audio files via API: Use a tool like
curl
or Postman to send aPOST
request to the/transcribe
endpoint with your audio file.Example
curl
command:curl -X POST "http://localhost:8080/transcribe" -H "accept: application/json" -H "Content-Type: multipart/form-data" -F "file=@path/to/your/audiofile.m4a"
API Documentation
The API is documented using Swagger. Once the application is running, you can access the documentation at:
http://localhost:8080/swagger/index.html
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
License
This project is licensed under the MIT License. See the LICENSE file for details.