
1. How to Create a Dataset in Supametas.AI Cloud Service

This article walks through the entire process of creating a dataset in Supametas.AI Cloud Service, from the basic concept of a dataset and its naming limits to API model configuration, so you can get started quickly.

Supametas · 2025-02-22

In data cleaning and processing, datasets play a central role. For users of Supametas.AI Cloud Service, a dataset is not only a place to store data but also the basis for managing and calling multimodal large model APIs.

1. Dataset Overview

On the Supametas.AI platform, each dataset is an independent storage space where users can store cleaned data. Creating a dataset is the first step to using the cloud service. You can view the datasets you have created on the platform’s main page and click the “+ Click to create a dataset” button to enter the dataset creation wizard.

2. Preparations Before Creating a Dataset

Before creating a dataset, make sure you have a multimodal large model API ready. The platform currently supports OpenAI GPT-4 multimodal models. You can get the required API key from OpenAI API Keys, or use a third-party OpenAI service provider instead.

3. Detailed Steps in the Dataset Creation Wizard

Figure: Create a new dataset

During the dataset creation process, you will go through the following configuration steps:

3.1 Enter Dataset Name

  • Requirement: Set a name for the new dataset, with a maximum length of 20 characters.
  • Tip: The name should be concise and clear for easier management and identification.

3.2 Add Dataset Description (Optional)

  • Requirement: You can add a brief description of the dataset, with a maximum of 50 characters.
  • Suggestion: The description can include the dataset’s purpose, data source, or other relevant information.
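
The two length limits above (20 characters for the name, 50 for the description) are easy to check before you open the wizard. The following is a minimal sketch, not part of the platform: the function name is hypothetical, and it assumes the platform counts characters the same way Python's len() does and that an empty name is rejected, neither of which is confirmed by this article.

```python
MAX_NAME_LENGTH = 20         # limit stated in step 3.1
MAX_DESCRIPTION_LENGTH = 50  # limit stated in step 3.2


def validate_dataset_fields(name: str, description: str = "") -> list[str]:
    """Return a list of problems; an empty list means both fields fit the limits."""
    problems = []
    if not name.strip():
        # Assumption: the wizard requires a non-empty name.
        problems.append("name is required")
    elif len(name) > MAX_NAME_LENGTH:
        problems.append(
            f"name has {len(name)} characters (max {MAX_NAME_LENGTH})"
        )
    # The description is optional, so only its upper bound is checked.
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append(
            f"description has {len(description)} characters "
            f"(max {MAX_DESCRIPTION_LENGTH})"
        )
    return problems


if __name__ == "__main__":
    print(validate_dataset_fields("support-tickets", "Cleaned support tickets"))
    print(validate_dataset_fields("a-very-long-dataset-name-for-testing"))
```

An empty result means the values should fit the wizard's fields; any other result lists what to shorten.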

3.3 Choose Model Type

In the “Model Settings” area, you need to decide which type of model to use:

  • Built-in System Model: Uses the default model configuration provided by the platform. Its quota is limited, and the model stops working once the quota is used up.
  • Configure External Model: Recommended. Configuring an external model (e.g., OpenAI/OneAPI) gives you more flexibility in controlling usage and costs.

3.4 API Configuration (For External Models)

If you choose to use an external model, you need to configure the following information:

  • API Key: Enter the API key you obtained.
  • BaseUrl: Fill in the base URL of the API.
  • Channel Selection: Choose the corresponding API channel from the dropdown list, such as OpenAI.
  • Model Selection: Choose the specific model version you need, such as gpt-4-turbo.
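
If you manage several datasets, it can help to keep these four values together in one place before typing them into the form. The sketch below is only an illustration: the class and field names are hypothetical and do not come from the platform, and the example base URL assumes OpenAI's own endpoint (a third-party provider will give you a different one).

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExternalModelConfig:
    """The four values requested in step 3.4 (field names are illustrative)."""

    api_key: str   # "API Key" field
    base_url: str  # "BaseUrl" field
    channel: str   # "Channel Selection" dropdown, e.g. OpenAI
    model: str     # "Model Selection" dropdown, e.g. gpt-4-turbo

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are empty or whitespace only."""
        return [
            field for field, value in asdict(self).items() if not value.strip()
        ]


if __name__ == "__main__":
    config = ExternalModelConfig(
        api_key="YOUR_API_KEY",                # placeholder, not a real key
        base_url="https://api.openai.com/v1",  # assumption: OpenAI's endpoint
        channel="OpenAI",
        model="gpt-4-turbo",
    )
    print(config.missing_fields())
```

Checking for empty fields up front avoids a round trip through the wizard's verification step for a simple omission.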

3.5 Save Configuration and Create Dataset

  • Action: After completing the above settings, click the “Next” button at the bottom of the page to save.
  • Verification: The system will automatically check if the API configuration you entered is correct. If there are errors, it will prompt you to re-enter; once correct, the dataset will be created automatically.
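
The article does not say how the platform performs this check. If you want to test a key and base URL yourself before entering them, one option is to call the model-listing endpoint of an OpenAI-compatible API. The sketch below assumes the provider exposes GET {base_url}/models with Bearer authentication, as OpenAI's API does, and that the base URL already includes the version prefix (for example /v1); the function names are hypothetical.

```python
import json
import urllib.error
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the model list of an OpenAI-compatible API."""
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )


def check_api_config(
    base_url: str, api_key: str, model: str, timeout: float = 10.0
) -> tuple[bool, str]:
    """Return (ok, message) describing whether the key, URL, and model look usable."""
    request = build_models_request(base_url, api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as error:
        # Raised for 4xx/5xx responses, e.g. 401 for an invalid key.
        return False, f"server rejected the request (HTTP {error.code})"
    except OSError as error:
        # Covers URLError, DNS failures, and timeouts.
        return False, f"could not reach {request.full_url}: {error}"
    except ValueError:
        return False, "response was not valid JSON"
    # Assumption: the response follows OpenAI's shape, {"data": [{"id": ...}]}.
    available = {item.get("id") for item in payload.get("data", [])}
    if model not in available:
        return False, f"model {model!r} is not listed for this key"
    return True, "ok"
```

A call such as check_api_config("https://api.openai.com/v1", "YOUR_API_KEY", "gpt-4-turbo") returns a success flag and a short message. A passing result here does not guarantee that the platform's own verification will pass, since its checks may differ.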

Tip: The dataset will use the API model you configured. Each time you create or modify a dataset, you can specify a different API model to better control usage and costs. For the built-in system models, since the quota is limited, it is recommended to use your own API whenever possible. If you have special requirements, you can also contact Supametas.AI by email to customize your quota (Email: [email protected]).

Creating a dataset is the first step in using Supametas.AI Cloud Service and serves as the foundation for subsequent operations such as data cleaning and metadata importing. We hope this walkthrough helps you create a dataset successfully and gives you a clearer understanding of key steps such as API configuration. Whether you're a beginner or an experienced user, you can get started quickly and make full use of the multimodal large model services provided by Supametas.AI.

© 2025 kazudata, Inc. All rights reserved