6. How to Import Local Audio Data into Supametas.AI Platform

This article explains how to use the local audio import feature of the Supametas.AI platform. It covers task creation, audio upload, task settings, parameter retrieval, and output configuration, so that you can process audio data efficiently from start to finish.

Supametas · 2025-02-22

Audio data is becoming increasingly important in modern data processing and multimodal applications. Supametas.AI offers an intuitive and efficient local audio import feature to help users easily integrate audio data into datasets for subsequent processing. This article will guide you step by step through the entire process from task creation to data import.

[Image: Create a new task to import local audio for the dataset]

1. Create a New Task

First, on the dataset detail page, select the "Local Audio Import" option from the "Import Data Source" menu, then click the "New Task" button.

Task Naming: Enter a task name of no more than 20 characters. A clear name helps you quickly identify and manage the task in the task list.

2. Upload Local Audio Files

Once the task is named, proceed to the audio file upload stage:

Upload Methods:
- You can drag and drop the audio files into the upload area, or click the upload button to select local audio files.
Supported File Formats:
- The platform supports the .mp3 and .wav audio formats.
File Limits:
- A maximum of 50 files can be uploaded per task.
- Each file must not exceed 200 MB.
Tip:
- Ensure that the files uploaded within the same task have similar content to improve the accuracy of parameter retrieval and output processing.
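The naming and upload limits described above can be checked locally before you open the upload dialog. The following is a minimal sketch, not part of the platform: the function name `check_upload` and the constants are hypothetical, and it assumes that "200MB" means 200 × 1024 × 1024 bytes and that the 20-character limit counts Unicode characters.

```python
from pathlib import Path

# Limits taken from the article; the byte interpretation of "200MB" is an assumption.
MAX_TASK_NAME_CHARS = 20
MAX_FILES_PER_TASK = 50
MAX_FILE_BYTES = 200 * 1024 * 1024
ALLOWED_EXTENSIONS = {".mp3", ".wav"}


def check_upload(task_name, paths):
    """Return a list of problems found; an empty list means the batch looks fine."""
    problems = []
    if len(task_name) > MAX_TASK_NAME_CHARS:
        problems.append(f"task name longer than {MAX_TASK_NAME_CHARS} characters")
    if len(paths) > MAX_FILES_PER_TASK:
        problems.append(f"too many files: {len(paths)} > {MAX_FILES_PER_TASK}")
    for path in map(Path, paths):
        # Compare extensions case-insensitively so "talk.WAV" is accepted.
        if path.suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"{path.name}: unsupported format")
        elif not path.is_file():
            problems.append(f"{path.name}: file not found")
        elif path.stat().st_size > MAX_FILE_BYTES:
            problems.append(f"{path.name}: larger than 200 MB")
    return problems
```

For example, `check_upload("interviews", ["ep01.mp3", "ep02.wav"])` returns an empty list when both files exist and are within the size limit. The platform performs its own validation on upload, so this check only saves a round trip.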

3. Task Settings

The task settings stage is similar to other import tasks, with the main goal of ensuring the system can correctly parse and process the uploaded audio data:

- Choose the appropriate parsing method based on the audio file type.
- Configure the necessary field information so that the data can be extracted accurately.

4. Retrieve Parameters

In this stage, you need to configure how the system will extract key information from the audio content. Common default fields include:

- Timeline: The system attempts to extract the time information of each segment from the audio content.
- Text Details: Using speech recognition, the system extracts the dialogue or descriptive text from the audio.
- Text Language: The system detects and records the language spoken in the audio.

If you need to classify specific data, you can enable custom fields:

- When adding custom fields, use English for the field names and provide detailed descriptions to improve extraction accuracy.
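Custom fields are configured in the platform's interface, but it can help to draft them locally first. The sketch below is purely illustrative: the `CustomField` class, the example field, and both checks are hypothetical. It assumes that "English" field names means ASCII letters, digits, and underscores, and it uses an arbitrary five-word minimum as a stand-in for "detailed description".

```python
import re
from dataclasses import dataclass

# Assumption: an "English" field name is ASCII letters, digits, and underscores.
_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# Arbitrary threshold used here as a rough proxy for a "detailed" description.
MIN_DESCRIPTION_WORDS = 5


@dataclass(frozen=True)
class CustomField:
    name: str
    description: str

    def problems(self):
        """Return a list of issues with this draft field; empty means it looks fine."""
        issues = []
        if not _NAME_RE.fullmatch(self.name):
            issues.append(f"{self.name!r}: use an English (ASCII) field name")
        if len(self.description.split()) < MIN_DESCRIPTION_WORDS:
            issues.append(f"{self.name!r}: description is too short to guide extraction")
        return issues


# Hypothetical example of a field that classifies each segment by speaker.
speaker_role = CustomField(
    name="speaker_role",
    description="Role of the person speaking in the segment, such as host or guest",
)
```

Calling `speaker_role.problems()` returns an empty list, whereas a name containing spaces or non-ASCII characters, or a one-word description, is flagged.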

5. Output Settings

After configuring parameter retrieval, set the output method, which determines how the extracted data is saved and exported:

Output Format Selection:
- JSON Format: Suitable for downstream API calls and programmatic processing.
- Markdown Format: Better suited to building knowledge bases and presenting documents.
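To illustrate why JSON suits programmatic processing, here is a minimal sketch that reads extracted records and renders them as a Markdown list. The article does not document the actual output schema, so the key names `timeline`, `text_details`, and `text_language` and the sample data are assumptions modeled on the default fields; adjust them to match the real export.

```python
import json

# Hypothetical sample; the real export schema may differ.
SAMPLE_JSON = """
[
  {"timeline": "00:00:00-00:00:05", "text_details": "Welcome to the show.", "text_language": "en"},
  {"timeline": "00:00:05-00:00:09", "text_details": "Thanks for having me.", "text_language": "en"}
]
"""


def to_markdown(records):
    """Render a list of extracted segments as a Markdown bullet list."""
    lines = []
    for record in records:
        lines.append(
            f"- **{record['timeline']}** ({record['text_language']}): "
            f"{record['text_details']}"
        )
    return "\n".join(lines)


records = json.loads(SAMPLE_JSON)
markdown = to_markdown(records)
```

The same records can be filtered, grouped by language, or sent to another service before rendering, which is harder to do once the data is already in Markdown.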

6. Save or Execute the Task Immediately

Finally, based on your needs, you can choose the task execution method:

Save and Execute Later:
- Save the task to the task list for manual execution later.
Execute Task Immediately:
- If the configuration is correct and you are ready, click the "Execute Task Now" button, and the system will start processing the uploaded audio files and import the extracted data into the specified dataset.

With intuitive task creation, file upload, parameter retrieval, and output setting processes, users can efficiently integrate and process audio data, laying a solid foundation for multimodal data processing and intelligent applications.
