
Using the Python CLI

For local processing or automation, use the ProtSpace Python package.

Installation

bash
pip install protspace

Commands

From UniProt Query

bash
protspace-query -q "(ft_domain:kinase) AND (reviewed:true)" -m pca2,umap2

From Your Own Embeddings

bash
protspace-local -i embeddings.h5 -m pca2,umap2

Parameters

The two commands share most parameters. They differ only in how the input is specified:

  • protspace-query uses -q to specify a UniProt query
  • protspace-local uses -i to specify an input embeddings file

| Parameter | Description                     | Command    |
|-----------|---------------------------------|------------|
| -q        | UniProt query string            | query only |
| -i        | Input embeddings (HDF5)         | local only |
| -o        | Output directory                | both       |
| -m        | Projection methods              | both       |
| -f        | Annotations (names or CSV path) | both       |
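
For automation, it can help to assemble the command line in a script instead of typing it by hand. The following is a minimal sketch, not part of ProtSpace: `build_protspace_command` is a hypothetical helper, the `results` output directory is an illustrative name, and only the flags from the parameter table (-q, -i, -o, -m, -f) are used. It builds the argument list and prints it without running anything.

```python
import shlex


def build_protspace_command(command, source, methods, output_dir=None, annotations=None):
    """Return an argv list for protspace-query or protspace-local.

    Hypothetical helper, not part of ProtSpace; it only uses the flags
    listed in the parameter table (-q, -i, -o, -m, -f).
    """
    if command == "protspace-query":
        argv = [command, "-q", source]   # UniProt query string
    elif command == "protspace-local":
        argv = [command, "-i", source]   # path to an HDF5 embeddings file
    else:
        raise ValueError(f"unknown command: {command!r}")

    argv += ["-m", ",".join(methods)]    # e.g. pca2,umap2
    if output_dir is not None:
        argv += ["-o", output_dir]       # output directory
    if annotations is not None:
        argv += ["-f", annotations]      # annotation names or a CSV path
    return argv


query_cmd = build_protspace_command(
    "protspace-query",
    "(ft_domain:kinase) AND (reviewed:true)",
    ["pca2", "umap2"],
    output_dir="results",  # illustrative directory name
)
local_cmd = build_protspace_command(
    "protspace-local",
    "embeddings.h5",
    ["pca2", "umap2"],
    annotations="annotations.csv",
)

# Print shell-quoted versions for inspection.
print(shlex.join(query_cmd))
print(shlex.join(local_cmd))
```

To actually execute one of these commands, the list can be passed to `subprocess.run(argv, check=True)` on a machine where ProtSpace is installed. Passing a list rather than a single string avoids shell-quoting problems with queries that contain parentheses and spaces.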

Annotations

Specify annotations with -f:

bash
# By name (auto-retrieved)
-f protein_family,reviewed,pfam,genus,species

# Or provide a CSV file
-f annotations.csv

CSV format:

csv
identifier,taxonomy,family,function
P12345,Bacteria,Kinase,ATP binding
P67890,Archaea,Phosphatase,Hydrolase
Q54321,Eukaryota,Kinase,Transferase

The identifier column must match protein IDs in your embeddings file.
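
Because mismatched identifiers are easy to miss, a quick check before running the CLI may be useful. The sketch below is an illustration only: `find_unmatched_identifiers` is a hypothetical helper, and the set of embedding IDs is assumed to have been loaded already (how the IDs are stored inside the HDF5 file is not covered here).

```python
import csv
import io


def find_unmatched_identifiers(csv_text, embedding_ids):
    """Return CSV identifiers that are absent from embedding_ids, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or "identifier" not in reader.fieldnames:
        raise ValueError("annotations CSV needs an 'identifier' column")
    return [
        row["identifier"]
        for row in reader
        if row["identifier"] not in embedding_ids
    ]


annotations_csv = """identifier,taxonomy,family,function
P12345,Bacteria,Kinase,ATP binding
P67890,Archaea,Phosphatase,Hydrolase
Q54321,Eukaryota,Kinase,Transferase
"""

# Assumed for illustration: the protein IDs from the embeddings file,
# already loaded into a set. Q54321 is deliberately left out.
embedding_ids = {"P12345", "P67890"}

unmatched = find_unmatched_identifiers(annotations_csv, embedding_ids)
print(unmatched)
```

An empty list means every annotated protein has a matching embedding. Any identifiers that are printed should be corrected in the CSV or added to the embeddings file.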

Projection Methods

Methods require a dimension suffix: 2 for 2D, 3 for 3D.

Dimension Suffix Required

Specify pca2 or pca3, not pca alone. The dimension suffix is mandatory.

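| Method | 2D      | 3D      | Description                                   |
|--------|---------|---------|-----------------------------------------------|
| PCA    | pca2    | pca3    | Principal Component Analysis                  |
| UMAP   | umap2   | umap3   | Uniform Manifold Approximation and Projection |
| t-SNE  | tsne2   | tsne3   | t-distributed Stochastic Neighbor Embedding   |
| PaCMAP | pacmap2 | pacmap3 | Pairwise Controlled Manifold Approximation    |
| MDS    | mds2    | mds3    | Multidimensional Scaling                      |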
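
A script that builds the -m value can catch a missing dimension suffix before the CLI is invoked. The following is a minimal sketch of such a pre-check, not ProtSpace's own validation: `parse_methods` and `KNOWN_METHODS` are hypothetical names, and the accepted method names are simply those in the table above.

```python
import re

# Method names taken from the table above.
KNOWN_METHODS = {"pca", "umap", "tsne", "pacmap", "mds"}


def parse_methods(spec):
    """Split a spec such as 'pca2,umap2' into (method, dimensions) pairs."""
    parsed = []
    for token in spec.split(","):
        token = token.strip()
        # A valid token is a lowercase method name followed by 2 or 3.
        match = re.fullmatch(r"([a-z]+)([23])", token)
        if match is None:
            raise ValueError(
                f"{token!r} needs a dimension suffix of 2 or 3, e.g. 'pca2'"
            )
        name, dims = match.group(1), int(match.group(2))
        if name not in KNOWN_METHODS:
            raise ValueError(f"unknown projection method: {name!r}")
        parsed.append((name, dims))
    return parsed


print(parse_methods("pca2,umap2"))

# A bare method name without a suffix is rejected.
try:
    parse_methods("pca")
except ValueError as err:
    print(err)
```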

TIP

ProtSpace is optimized for 2D visualization, so prefer the 2D methods (for example, umap2) over their 3D counterparts (umap3).

More Info

For full documentation and more examples, see the ProtSpace Python repository on GitHub.

Released under the Apache 2.0 License.