Self-hosted Neptune deployment configuration#

Self-hosted Neptune is composed of a set of microservices (Neptune Services) distributed as a Helm chart deployable on Kubernetes.

The Neptune installer consists of two files:

  • neptune_installation_{version}.tgz – an archive that you need to unpack. It contains a single directory called neptune_installation.
  • configuration.yaml – contains a minimal installation configuration. Before starting the installation, modify the defaults to suit your deployment scenario if needed.

Warning

The configuration.yaml file may contain Neptune's Docker registry credentials and your license for using Neptune. Treat it as confidential.

Depending on the installation type, some parameters are optional.

General configuration#

This section describes some basic options you need to configure before installing Neptune.

Parameter        Description Example value
administrator_username Desired username of your administrator account. Can contain letters, numbers, and hyphens. Recommended: administrator
administrator_password Password for the administrator account. Use a strong password.
organization_name Name of your organization in Neptune. Can contain letters, numbers and hyphens. Something simple, related to your company or organization.
deployment_type Set to local for a single-node deployment, or to cluster for deployment on an existing Kubernetes cluster. -
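
As a minimal sketch, the general options might look as follows in configuration.yaml. The sketch assumes that the parameters are plain top-level keys, which this page does not confirm; check the configuration.yaml shipped with your installer for the actual layout. All values are placeholders.

```yaml
# Sketch only: key placement at the top level is an assumption.
# All values below are hypothetical placeholders.
administrator_username: administrator
administrator_password: "<a-strong-password>"
organization_name: example-org
deployment_type: local   # or "cluster" for an existing Kubernetes cluster
```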

Kubernetes cluster configuration#

Use options in this section to tell the Neptune installer how to interact with your Kubernetes cluster.

Note

Only use these options with cluster deployments (deployment_type set to cluster).

Parameter        Description Example value
kubeconfig_path Absolute path to an existing kubectl configuration file that points to the cluster where you want to deploy Neptune. /etc/rancher/k3s/k3s.yaml
namespace Kubernetes namespace where Neptune will be deployed. Created if it doesn't exist. Default: neptune
node_tolerations Array of node taint tolerations that the Neptune installation can use. This is copied directly to Kubernetes manifests. -
node_selector Key-value map that the Neptune installation should use to select nodes. This is copied directly to Kubernetes manifests. -
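
The following sketch shows how these options could be combined for a cluster deployment. Because node_tolerations and node_selector are copied directly to Kubernetes manifests, the example uses the standard Kubernetes toleration and node selector shapes. The taint key dedicated and the node label workload are hypothetical, and the top-level key placement is an assumption.

```yaml
# Sketch only: top-level key placement is an assumption.
deployment_type: cluster
kubeconfig_path: /etc/rancher/k3s/k3s.yaml
namespace: neptune

# Standard Kubernetes toleration entries.
# The taint "dedicated=neptune:NoSchedule" is a hypothetical example.
node_tolerations:
  - key: dedicated
    operator: Equal
    value: neptune
    effect: NoSchedule

# Standard Kubernetes nodeSelector map.
# The label "workload: neptune" is a hypothetical example.
node_selector:
  workload: neptune
```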

Storage configuration#

Use the following options to configure how to provision storage for Neptune in your deployment.

The types of data that Neptune stores can be roughly split into two categories:

  1. Service storage – the storage required by MySQL, Elasticsearch, and Kafka. This data is always stored on a POSIX-compliant file system.
  2. Object storage – for the bulk of your data. This refers to ML metadata that Neptune tracks and stores, such as logs, numerical series, tabular data, and images. This data can be stored either on a POSIX-compliant file system or in S3-compatible object storage.

For more details about storage requirements and options, see Neptune's system requirements.

Regardless of the deployment type, you can always provide your own MySQL, Elasticsearch, Kafka, and S3-compatible object storage, in which case Neptune doesn't need to provision any additional storage.

Service storage#

If you prefer to use the MySQL, Elasticsearch, and Kafka services provided as part of the Neptune installer, they need a way to provision storage for themselves. The options depend on the deployment type.

Parameter        Description Example value
storage_device Should point to an SSD device. You can retrieve the value from the output of the lsblk command. For a clean VM, it's often /dev/sdb.
storage_path Absolute path on the local system where Neptune will store its data. This is also where the disk is mounted. Default: /mnt/neptune
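
A minimal sketch of these two options, using the example values from the table. It assumes that they are top-level keys and that they apply to a deployment where Neptune stores data on the local system; neither assumption is confirmed by this page.

```yaml
# Sketch only: top-level key placement is an assumption.
# Confirm the device name with `lsblk` before using it.
storage_device: /dev/sdb
storage_path: /mnt/neptune
```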

If you use MySQL, Kafka, or Elasticsearch delivered as part of the Neptune installation, the following parameters allow Neptune to provision disk space for them:

Parameter         Description
ssd_storage_class Name of the storage class to be used by MySQL and Elasticsearch services.
hdd_storage_class Name of the storage class to be used for Kafka.
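
As a sketch, the storage class options could be set as shown below. The class names fast-ssd and standard-hdd are hypothetical; use the names of storage classes that exist in your cluster (for example, as listed by kubectl get storageclass). Top-level key placement is an assumption.

```yaml
# Sketch only: top-level key placement is an assumption.
# Both storage class names are hypothetical placeholders.
ssd_storage_class: fast-ssd       # used by MySQL and Elasticsearch
hdd_storage_class: standard-hdd   # used by Kafka
```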

Object storage#

Note for local deployments

Object storage can be provisioned automatically by the installer.

You can implement object storage in one of the following ways:

  • By providing your own file storage with the storage_pvc_name option.
  • With S3-compatible storage.

Options in this section work regardless of whether you're deploying Neptune in local or cluster mode.

Neptune stores most of your data either on a POSIX file system or in object storage. In most situations, we strongly recommend using object storage.

Parameter         Description
storage_pvc_name Name of the Persistent Volume Claim in the namespace where Neptune will be installed. The claim must have the ReadWriteMany access mode, and the underlying storage must be POSIX-compliant. This parameter is ignored if S3-compatible object storage is provided.

2.2 migration note

If you're migrating from Neptune version 2.1 to 2.2, the storage_pvc_name value will serve as the source for the migration. No new data will be written there.
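
For illustration, a claim that satisfies the ReadWriteMany requirement could be declared with a standard Kubernetes manifest like the one below, followed by the matching configuration.yaml entry. The claim name neptune-storage, the storage class nfs-client, and the requested size are hypothetical, and the top-level placement of storage_pvc_name is an assumption.

```yaml
# Kubernetes manifest (applied with kubectl, not part of configuration.yaml).
# Name, storage class, and size are hypothetical placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: neptune-storage
  namespace: neptune        # must match the namespace Neptune is installed to
spec:
  accessModes:
    - ReadWriteMany         # access mode required for storage_pvc_name
  storageClassName: nfs-client
  resources:
    requests:
      storage: 500Gi
---
# Matching entry in configuration.yaml (top-level placement is an assumption).
storage_pvc_name: neptune-storage
```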

Parameter        Description
s3_bucket_name Name of the bucket Neptune uses for storing most of your data.
s3_service_endpoint The S3 service endpoint to connect to.
  • If you're using AWS S3, see the AWS docs for the available service endpoints.
  • If you're using an S3-compatible service, set this to that service's endpoint. For example, on GCP: https://storage.googleapis.com/
s3_region The AWS region corresponding to the s3_service_endpoint parameter. For S3-compatible services, the value depends on the service.
s3_access_key_id An S3 access key.
s3_secret_access_key An S3 secret key.
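
A minimal sketch of an S3 setup against AWS, assuming top-level keys. The bucket name, region, and credentials are placeholders; the endpoint follows the regional AWS pattern https://s3.<region>.amazonaws.com.

```yaml
# Sketch only: top-level key placement is an assumption.
# Bucket name, region, and credentials are hypothetical placeholders.
s3_bucket_name: example-neptune-data
s3_service_endpoint: https://s3.eu-west-1.amazonaws.com
s3_region: eu-west-1
s3_access_key_id: "<access-key-id>"
s3_secret_access_key: "<secret-access-key>"
```

Because this file then holds storage credentials, the earlier warning about treating configuration.yaml as confidential applies here as well.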

Database configuration#

The Neptune installer can set up a MySQL database as part of the installation process, but you can also provide one yourself. You may want to do this especially in cloud environments, where the storage used by the database is automatically scaled.

Parameter        Description
db_host External MySQL database host for Neptune to use.
db_port (Optional) External MySQL database port. Default: 3306
db_username The username of a user with access to all schemas used by Neptune. Required if db_host is set.
db_password The password of the user specified in db_username. Required if db_host is set.
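
As a sketch, pointing Neptune at an external MySQL database might look like this. The host name and credentials are hypothetical, and top-level key placement is an assumption.

```yaml
# Sketch only: top-level key placement is an assumption.
# Host and credentials are hypothetical placeholders.
db_host: mysql.example.internal
db_port: 3306                 # optional; 3306 is the documented default
db_username: neptune
db_password: "<db-password>"
```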

Required database schemas#

Before running the Neptune installer, create the listed database schemas and grant the user defined in db_username access to them.

Make sure that the schemas have UTF-8 as the default character set.

  • neptune_instance
  • neptune_notifications
  • neptune_discussions
  • neptune_leaderboard
  • neptune_keycloak
  • neptune_artifacts

Elasticsearch configuration#

The Neptune installer can set up an Elasticsearch instance as part of the installation process, but you can also provide one yourself.

Parameter          Description
elasticsearch_address Address of the external Elasticsearch server host, in the format https://<address>[:<http api port>]. The <http api port> is optional and defaults to 443 for HTTPS connections.
elasticsearch_cluster_name Name of your Elasticsearch cluster. Defaults to elasticsearch, which is the default cluster name in Elasticsearch.
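
A minimal sketch of an external Elasticsearch configuration, assuming top-level keys. The host name and port are hypothetical.

```yaml
# Sketch only: top-level key placement is an assumption.
# The host and port are hypothetical; omit the port to use 443.
elasticsearch_address: https://es.example.internal:9200
elasticsearch_cluster_name: elasticsearch
```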

Kafka configuration#

The Neptune installer can set up a Kafka instance as part of the installation process, but you can also provide one yourself.

Parameter        Description
kafka_address Address of the external Kafka cluster, as a comma-separated list of hosts. Required format:

<address 1>:<port>,<address 2>:<port>,...<address N>:<port>
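
For example, a three-broker cluster could be written as shown below. The host names and port are hypothetical, and top-level key placement is an assumption. The value is quoted so that it is read as a single string.

```yaml
# Sketch only: top-level key placement is an assumption.
# Broker host names and ports are hypothetical placeholders.
kafka_address: "kafka-1.example.internal:9092,kafka-2.example.internal:9092,kafka-3.example.internal:9092"
```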

Additional configuration#

Identity management system#

In some cases, you may want to connect Neptune to your own identity management system (like LDAP).

If your identity management system uses a certificate issued by your own Certificate Authority, you need to provide a set of trusted certificates to Neptune (so that it's able to verify the security of the connection).

Parameter        Description
trusted_certificates List of absolute paths to files containing certificates that are to be trusted.
keycloak_java_opts Rarely needed option that allows overriding some default behaviors of Java. The value is a string containing Java options that are to be added to Keycloak (the component responsible for connecting to an external identity management system).

Example value: "-Djdk.tls.client.protocols=TLSv1.0,TLSv1.1,TLSv1.2"
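
Put together, these two options might look as follows. The certificate path is hypothetical, and top-level key placement is an assumption; the Java options string is the example value given above.

```yaml
# Sketch only: top-level key placement is an assumption.
# The certificate path is a hypothetical placeholder.
trusted_certificates:
  - /etc/ssl/certs/example-corp-root-ca.pem

# Only set this if you need to override Keycloak's default Java behavior.
keycloak_java_opts: "-Djdk.tls.client.protocols=TLSv1.0,TLSv1.1,TLSv1.2"
```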

Ingress controller#

For how to configure the ingress controller and expose Neptune from your VM or cluster to the outside, see Expose Neptune.