Using the Tensorent Dashboard

1. Logging In

  1. Go to https://dashboard.tensorent.com (this is a placeholder URL; use the actual URL provided by Tensorent)
  2. Enter your credentials.
  3. Access your dedicated cloud instance panel.

2. Customizing Your Instance

After logging in, you can customize your instance to meet your specific requirements. The dashboard typically presents the options as a form or a short series of steps, covering choices such as GPU type and count, vCPUs, memory, storage, and base image.

Once you've made your selections, click the "Create Instance" or "Launch Instance" button to start provisioning your customized GPU instance.

3. SSH Access to Your Instance

Secure Shell (SSH) is the standard method for securely accessing your remote instance. You'll need an SSH client (like OpenSSH on Linux/macOS or PuTTY on Windows).

Troubleshooting SSH:

  • Connection refused or timed out: confirm the instance is running and that its firewall allows inbound traffic on port 22.
  • Permission denied (publickey): verify the username, IP address, and key (e.g. ssh -i ~/.ssh/your_key user@your-instance-ip), and make sure the private key file has restrictive permissions (chmod 600 ~/.ssh/your_key).
  • Host key warning after an instance is rebuilt: remove the stale entry with ssh-keygen -R your-instance-ip and reconnect.

4. Using the Integrated Notebook

Tensorent provides a built-in notebook environment for interactive computing, similar to Jupyter Notebook or JupyterLab, so you can run code against the instance's GPU without any local setup.
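
A quick way to confirm that the notebook can see the instance's GPU is to run a check in a cell. This is a minimal sketch assuming TensorFlow and/or PyTorch are installed in the environment; use whichever framework applies:

# Run in a notebook cell to verify GPU visibility
import tensorflow as tf
print("TensorFlow GPUs:", tf.config.list_physical_devices("GPU"))

import torch
print("PyTorch CUDA available:", torch.cuda.is_available())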

Troubleshooting Notebooks:

  • Kernel fails to start or dies unexpectedly: check the instance's memory and GPU usage; running out of memory is the most common cause.
  • Missing packages: install them from a cell with %pip install <package> so they land in the active kernel's environment.
  • GPU not detected: run the check above, and restart the kernel after installing or upgrading a framework.

Hosting AI Models and Applications

1. Deploying an AI Model

This section describes how to deploy a trained AI model for inference (making predictions).

  1. Prepare your model:
    • Save your model: Export your trained model in a suitable format. Common formats include:
      • TensorFlow: SavedModel format (a directory containing a `.pb` graph) or HDF5 (`.h5`)
      • PyTorch: TorchScript (`.pt`) or a state dictionary (`.pth`)
      • Scikit-learn: Joblib (`.joblib`) or Pickle (`.pkl`)
      Example (TensorFlow):
      import tensorflow as tf  # assumes `model` is a tf.keras model

      # Save as a SavedModel (written as a directory)
      tf.saved_model.save(model, "path/to/saved_model")

      # Save as a single HDF5 file (Keras models only)
      model.save("path/to/model.h5")

      Example (PyTorch):
      import torch  # assumes `model` is a torch.nn.Module and `example_input` is a sample input tensor

      # Save as TorchScript (loadable without the original class definition)
      traced_model = torch.jit.trace(model, example_input)
      torch.jit.save(traced_model, "path/to/model.pt")

      # Save only the state dictionary (reloading requires the model class)
      torch.save(model.state_dict(), "path/to/model.pth")

    • Create a prediction script: Write a Python script (e.g., `predict.py`) that loads your model and handles prediction requests. This script will act as the interface between your model and incoming requests. It should:
      • Load the saved model.
      • Preprocess input data (resize images, tokenize text, etc.).
      • Run the model's prediction function.
      • Post-process the output (convert probabilities to class labels, etc.).
      • Return the prediction in a suitable format (e.g., JSON).
      Example (TensorFlow, using Flask for a simple web server):
      # predict.py
      from flask import Flask, request, jsonify
      import tensorflow as tf
      import numpy as np

      # Load the model (replace with your actual path)
      model = tf.saved_model.load("path/to/saved_model")

      app = Flask(__name__)

      @app.route("/predict", methods=["POST"])
      def predict():
          # Get data from the request (assuming JSON with an 'image' key)
          data = request.get_json()
          image = np.array(data["image"], dtype=np.float32)

          # Preprocess the image (example: resize to the model's input size)
          image = tf.image.resize(image, [224, 224])

          # Add a batch dimension and run the model
          predictions = model(tf.expand_dims(image, axis=0))

          # Post-process (example: take the class with the highest probability)
          predicted_class = np.argmax(predictions.numpy())

          # Return the result as JSON
          return jsonify({"predicted_class": int(predicted_class)})

      if __name__ == "__main__":
          app.run(host="0.0.0.0", port=5000)  # Listen on all interfaces, port 5000

    • Create a requirements.txt file: List all the Python packages your prediction script depends on. This ensures that the correct environment can be created for your model.
      # requirements.txt
      flask
      tensorflow
      numpy
                          
  2. Upload your files: Upload your model files, prediction script (`predict.py`), and `requirements.txt` file to your instance (using SCP, SFTP, or the notebook interface's file upload feature).
  3. Choose a runtime: Select the appropriate runtime environment (e.g., Python 3.8 with TensorFlow 2.x) through your Tensorent dashboard.
  4. Set resource limits: Specify the CPU, memory, and GPU resources allocated to your deployment. This prevents your model from consuming excessive resources and affecting other services.
  5. Deploy the model: Initiate the deployment process through your dashboard. This typically involves:
    • Creating a container (e.g., using Docker) that packages your model, script, and dependencies.
    • Starting the container on your instance.
    • Setting up a network endpoint (e.g., a URL with a port) that allows external access to your model.
  6. Obtain the API endpoint: Once deployed, the dashboard will provide you with an API endpoint (a URL) that you can use to send prediction requests to your model. This endpoint might look like: `http://your-instance-ip:5000/predict`.
  7. Test your deployment: Send a test request to your API endpoint to verify that it's working correctly. You can use tools like `curl` (command-line) or Postman (GUI) to send requests. Example (`curl`):
    curl -X POST -H "Content-Type: application/json" -d '{"image": [[...image data...]]}' http://your-instance-ip:5000/predict
                    
    Replace `[[...image data...]]` with the actual image data in the format expected by your `predict.py` script.
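
    You can also exercise the endpoint from Python. A minimal sketch using the requests library, assuming the Flask service above (the all-zero payload is a placeholder, not a real image):

    import numpy as np
    import requests

    # Placeholder all-zero image; replace with real data in the shape
    # your predict.py expects (here: height x width x channels).
    payload = {"image": np.zeros((224, 224, 3)).tolist()}

    resp = requests.post("http://your-instance-ip:5000/predict", json=payload, timeout=30)
    resp.raise_for_status()
    print(resp.json())  # e.g. {"predicted_class": 0}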

2. Hosting Games and Applications

Hosting games and other applications is conceptually similar to deploying AI models, but the specific steps may vary depending on the application.

  1. Prepare your application: Ensure your game or application is configured to run in a server environment. This might involve:
    • Configuring network settings (ports, IP addresses).
    • Setting up any required databases or external services.
    • Creating configuration files.
  2. Containerize (Recommended): Package your application and its dependencies into a container (e.g., using Docker). This makes deployment more reliable and portable.
    # Example Dockerfile
    # A slim Python base image keeps pip usable out of the box; switch to a
    # CUDA-enabled base image (e.g. nvidia/cuda) if your application needs the GPU.
    FROM python:3.11-slim

    # Copy your application files
    WORKDIR /app
    COPY . .

    # Install Python requirements
    RUN pip install --no-cache-dir -r requirements.txt

    # Expose the necessary port (replace 8080 with your application's port)
    EXPOSE 8080

    # Run your application
    CMD ["python3", "app.py"]

  3. Upload: Upload your application files or Docker image to your instance.
  4. Define server specs: Specify the CPU, memory, and GPU resources required by your application.
  5. Network and security:
    • Configure firewall rules to allow traffic to the appropriate ports.
    • Consider setting up a domain name and SSL certificate for secure access (especially for web applications).
  6. Launch: Start your application or container through the dashboard (or from a script; see the sketch after this list).
  7. Monitor: Use the dashboard to monitor the resource usage and performance of your application.
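
If you prefer to script the build and launch rather than use the dashboard or CLI, the Docker SDK for Python can drive both. This is a minimal sketch, assuming Docker is running on the instance and the Dockerfile above sits in the current directory ("myapp" is a hypothetical image tag):

import docker

# Connect to the local Docker daemon
client = docker.from_env()

# Build the image from the Dockerfile in the current directory
image, build_logs = client.images.build(path=".", tag="myapp:latest")

# Start a detached container, mapping container port 8080 to host port 8080
container = client.containers.run(
    "myapp:latest",
    detach=True,
    ports={"8080/tcp": 8080},
)
print(container.short_id, container.status)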

API Reference

Tensorent provides a RESTful API for managing your instances programmatically. All API requests should be made over HTTPS. The base URL for the API is: `https://api.tensorent.com/v1` (This is a placeholder URL; use the actual URL provided by Tensorent).

Authentication

API requests are authenticated using API keys. You can generate API keys from your Tensorent dashboard. Include your API key in the `Authorization` header of each request:

Authorization: Bearer YOUR_API_KEY
           

Replace `YOUR_API_KEY` with your actual API key. Keep your API key secret and do not share it.
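
For example, with Python's requests library (a sketch; the /instances path below is illustrative, not a confirmed endpoint):

import os
import requests

BASE_URL = "https://api.tensorent.com/v1"   # placeholder base URL
API_KEY = os.environ["TENSORENT_API_KEY"]   # keep keys out of source code

headers = {"Authorization": f"Bearer {API_KEY}"}

# Illustrative endpoint: list your instances
resp = requests.get(f"{BASE_URL}/instances", headers=headers, timeout=30)
resp.raise_for_status()
print(resp.json())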

Instance Management

These endpoints allow you to create, manage, and delete GPU instances.

Resource Monitoring

These endpoints provide real-time metrics on resource usage.

Snapshots

Snapshots allow you to create backups of your instances.

Automated Scaling

This section is a placeholder. Automated scaling often requires custom setup and integration with monitoring services. Contact Tensorent support for details.

Possible API endpoints (Illustrative):

  • POST /instances/{id}/autoscale: enable autoscaling with a scaling policy
  • GET /instances/{id}/autoscale: retrieve the current scaling policy
  • DELETE /instances/{id}/autoscale: disable autoscaling

These paths are illustrative only; refer to the official API documentation for the actual endpoints.
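
Continuing from the authentication sketch above (BASE_URL and headers as defined there), enabling a policy might look like the following; the payload schema is likewise hypothetical:

# Illustrative only: enable an autoscaling policy on an instance
resp = requests.post(
    f"{BASE_URL}/instances/inst-123/autoscale",   # "inst-123" is a hypothetical instance ID
    headers=headers,
    json={"min_gpus": 1, "max_gpus": 4, "target_gpu_utilization": 0.8},
    timeout=30,
)
resp.raise_for_status()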

Error Handling

The API uses standard HTTP status codes to indicate success or failure. Common error codes include:

  • 400 Bad Request: the request was malformed or missing a required parameter
  • 401 Unauthorized: the API key is missing or invalid
  • 403 Forbidden: the API key lacks permission for the requested operation
  • 404 Not Found: the requested resource (e.g., an instance ID) does not exist
  • 429 Too Many Requests: the rate limit was exceeded; retry with backoff
  • 500 Internal Server Error: an unexpected error occurred on the server

Error responses typically include a JSON body with more details:

{
  "error": {
    "code": "invalid_request",
    "message": "The 'gpu_type' parameter is required."
  }
}
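
A client can branch on the status code and surface the structured message. A minimal sketch with requests (reusing BASE_URL and headers from the authentication example):

resp = requests.post(f"{BASE_URL}/instances", headers=headers, json={}, timeout=30)

if not resp.ok:
    try:
        # Pull the structured error out of the JSON body when present
        err = resp.json().get("error", {})
        print(f"{resp.status_code}: {err.get('code')} - {err.get('message')}")
    except ValueError:  # body was not JSON
        print(f"{resp.status_code}: {resp.text}")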
            

Tensorent – Powering the Future of AI and Cloud Computing.