Audio data is becoming increasingly important in modern data processing and multimodal applications. Supametas.AI offers an intuitive and efficient local audio import feature to help users easily integrate audio data into datasets for subsequent processing. This article will guide you step by step through the entire process from task creation to data import.
1. Create a New Task
First, on the dataset detail page, select the "Local Audio Import" option from the "Import Data Source" menu, then click the "New Task" button.
- Task Naming: Enter a task name with no more than 20 characters, which will help you quickly identify and manage the task in the task list.
2. Upload Local Audio Files
Once the task is named, proceed to the audio file upload stage:
- Upload Methods:
- You can drag and drop the audio files into the upload area, or click the upload button to select local audio files.
- Supported File Formats:
- The platform supports .mp3 and .wav audio formats.
- File Limits:
- A maximum of 50 files can be uploaded per task;
- Each file size must not exceed 200MB.
- Tip:
- Ensure that the files uploaded within the same task have similar content to improve the accuracy of parameter retrieval and output processing.
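The format and size limits above can be checked locally before uploading. The following is a minimal sketch, not part of Supametas.AI: the helper names `check_upload_batch` and `scan_folder` are hypothetical, "200MB" is assumed to mean 200 × 1024 × 1024 bytes, and extensions are assumed to be matched case-insensitively.

```python
from pathlib import Path

# Limits taken from the article; these helpers are hypothetical and are
# not part of any Supametas.AI tooling.
MAX_FILES_PER_TASK = 50
MAX_FILE_SIZE_BYTES = 200 * 1024 * 1024  # assumption: "200MB" means 200 MiB
ALLOWED_EXTENSIONS = {".mp3", ".wav"}    # assumption: compared case-insensitively


def check_upload_batch(files):
    """Return a list of problems for (name, size_in_bytes) pairs; empty means OK."""
    problems = []
    if len(files) > MAX_FILES_PER_TASK:
        problems.append(
            f"{len(files)} files selected; the limit is {MAX_FILES_PER_TASK} per task"
        )
    for name, size in files:
        suffix = Path(name).suffix.lower()
        if suffix not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported format {suffix or '(none)'}")
        if size > MAX_FILE_SIZE_BYTES:
            problems.append(f"{name}: {size} bytes exceeds the 200MB limit")
    return problems


def scan_folder(folder):
    """Collect (name, size) pairs for every regular file in a local folder."""
    return [
        (path.name, path.stat().st_size)
        for path in sorted(Path(folder).iterdir())
        if path.is_file()
    ]
```

For example, `check_upload_batch(scan_folder("recordings"))` returns an empty list when every file in a local `recordings` folder fits the stated limits, and one message per violation otherwise.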
3. Task Settings
The task settings stage is similar to other import tasks, with the main goal of ensuring the system can correctly parse and process the uploaded audio data:
- Choose the appropriate parsing method based on the audio file type.
- Configure necessary field information to ensure the data can be accurately extracted.
4. Retrieve Parameters
In this stage, you need to configure how the system will extract key information from the audio content. Common default fields include:
- Timeline: The system will attempt to extract the time information of each segment from the audio content.
- Text Details: Using speech recognition technology, the system will extract the dialogue or descriptive text from the audio.
- Text Language: The system will also detect and record the language spoken in the audio.
If you need to classify specific data, you can enable custom fields:
- When adding custom fields, use English for the field names and provide detailed descriptions to improve extraction accuracy.
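The naming guidance for custom fields can be drafted and reviewed offline before the fields are entered in the platform. This is only a sketch under stated assumptions: the `CustomField` structure is hypothetical, "English" names are approximated as ASCII letters, digits, spaces, and underscores, and the five-word threshold for a "detailed" description is arbitrary.

```python
import re
from dataclasses import dataclass


# Hypothetical representation of a custom field; the platform's own form
# may look different.
@dataclass
class CustomField:
    name: str
    description: str


# Assumption: an "English" field name starts with an ASCII letter and contains
# only ASCII letters, digits, spaces, and underscores.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_ ]*")

# Arbitrary threshold used here as a stand-in for "detailed".
MIN_DESCRIPTION_WORDS = 5


def review_field(field):
    """Return a list of warnings for one custom field; empty means it looks fine."""
    warnings = []
    if not _NAME_PATTERN.fullmatch(field.name):
        warnings.append(f"{field.name!r}: use an English field name")
    if len(field.description.split()) < MIN_DESCRIPTION_WORDS:
        warnings.append(f"{field.name!r}: description is probably too short")
    return warnings
```

A field such as `CustomField("speaker_role", "Role of the person speaking, such as host, guest, or caller")` passes both checks, while a one-word description triggers a warning.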
5. Output Settings
After configuring parameter retrieval, the next step is to choose the output format, which determines how the extracted data will be saved and exported:
- Output Format Selection:
- JSON Format: Suitable for downstream API calls and programmatic processing.
- Markdown Format: Better suited to building knowledge bases and presenting documents.
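To illustrate why JSON suits programmatic processing, the sketch below groups transcribed segments by detected language. The export shape is an assumption: the key names `timeline`, `text_details`, and `text_language`, and the timestamp format, are modeled on the default fields described earlier and are not the platform's documented schema.

```python
import json
from collections import defaultdict

# Hypothetical export: key names and timestamp format are assumptions
# modeled on the default fields, not a documented Supametas.AI schema.
SAMPLE_EXPORT = """
[
  {"timeline": "00:00:00-00:00:04", "text_details": "Welcome to the show.", "text_language": "en"},
  {"timeline": "00:00:04-00:00:09", "text_details": "Bienvenue à tous.", "text_language": "fr"},
  {"timeline": "00:00:09-00:00:15", "text_details": "Let's get started.", "text_language": "en"}
]
"""


def group_by_language(export_text):
    """Map each detected language to its list of (timeline, text) segments."""
    segments = json.loads(export_text)
    grouped = defaultdict(list)
    for segment in segments:
        grouped[segment["text_language"]].append(
            (segment["timeline"], segment["text_details"])
        )
    return dict(grouped)
```

Calling `group_by_language(SAMPLE_EXPORT)` on the sample yields two English segments and one French segment; a real export would need the key names adjusted to whatever the platform actually produces.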
6. Save or Execute the Task Immediately
Finally, based on your needs, you can choose the task execution method:
- Save and Execute Later:
- Save the task to the task list for manual execution later.
- Execute Task Immediately:
- If the configuration is correct and you are ready, click the "Execute Task Now" button, and the system will start processing the uploaded audio files and import the extracted data into the specified dataset.
With intuitive task creation, file upload, parameter retrieval, and output setting processes, users can efficiently integrate and process audio data, laying a solid foundation for multimodal data processing and intelligent applications.