Importing Data
ProtSpace uses .parquetbundle files containing protein embeddings and annotations.
Drag and Drop (Recommended)
The easiest way to load data:
- Locate your .parquetbundle file on your computer
- Drag it onto the scatterplot canvas
- Drop when you see the drop indicator
- Data loads automatically
Drop Anywhere
You can drop the file anywhere on the scatterplot area - it doesn't need to be a specific location.
Import Button
Alternatively, use the Import button in the control bar:
- Click the Import button in the top-right corner
- Select your .parquetbundle file from the file picker
- Click Open
Example Datasets
Don't have data yet? Download example .parquetbundle files from the GitHub data folder.
What Happens When You Load Data
After successfully loading a file:
- Scatterplot populates: All proteins appear as colored points
- View restored or initialized: ProtSpace restores the requested URL annotation and projection when they exist in the dataset; otherwise it falls back to the first available options
- Settings restored: Previously saved or bundled customizations are applied
- Legend appears: Shows all categories with color assignments
- Ready to explore: You can now pan, zoom, and interact with the data
Loading Time
Small datasets (< 10K proteins) load instantly. Larger datasets may take a few seconds to process and render.
Data & Settings Persistence
All persistence is local to your browser; your data is never sent to a server.
- Your dataset is remembered: The last imported file is saved in your browser's Origin Private File System (OPFS) and automatically restored when you revisit ProtSpace. Switching to the demo dataset clears the stored file.
- Settings persist per dataset: Legend customizations (colors, shapes, hidden categories, sort order) and export options are saved in browser storage for each dataset. When you reload or revisit the same dataset, your settings are restored.
- Annotation and projection persist in the URL: ProtSpace keeps the currently selected annotation and projection in the page URL as query parameters (annotation=... and projection=...). Refreshing the page, using the browser's back/forward buttons, or sharing the link will restore the same view when those options exist in the active dataset. A bare /explore URL stays unchanged on first load; ProtSpace only writes view params after you change the selection or when it needs to normalize an invalid URL value.
- File-embedded settings take priority: If a .parquetbundle includes saved settings (via the export dialog's "Include legend/export settings" options), those are applied on import, replacing any previously stored settings for that dataset.
- Starting fresh: To reset all settings for a dataset, re-import a .parquetbundle that has embedded settings, or clear site data in your browser settings.
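The priority rule for file-embedded settings can be pictured with a small sketch. This is not ProtSpace's actual code: the Settings type, the in-memory Map standing in for browser storage, and the function name applyImportSettings are all hypothetical, chosen only to illustrate "embedded settings replace stored settings on import".

```typescript
// Hypothetical shape for per-dataset settings (legend colors, shapes, etc.).
type Settings = Record<string, unknown>;

// Illustrative stand-in for browser storage, keyed by a dataset identifier.
// Assumption: the real application persists settings some other way.
function applyImportSettings(
  store: Map<string, Settings>,
  datasetId: string,
  embedded: Settings | undefined
): Settings | undefined {
  if (embedded !== undefined) {
    // Settings bundled in the file win: they replace (not merge with)
    // whatever was previously stored for this dataset.
    store.set(datasetId, embedded);
    return embedded;
  }
  // No embedded settings: fall back to what was stored earlier, if anything.
  return store.get(datasetId);
}
```

In this sketch, importing a file without embedded settings leaves the stored settings untouched, which matches the "settings persist per dataset" behavior described above.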
URL-backed view state
If the URL points to an annotation or projection that does not exist in the currently loaded dataset, ProtSpace falls back to the first available option for that setting and updates the URL to match.
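As a rough illustration of this fallback, the sketch below resolves a query string against the annotations and projections a dataset offers. It is an assumption-laden sketch, not ProtSpace's implementation: the function names pick and resolveView are hypothetical, and rewriting both parameters whenever either one is invalid is a simplifying choice made here. Only the parameter names annotation and projection come from the text above.

```typescript
interface ViewState {
  annotation: string;
  projection: string;
}

// Choose the requested value if the dataset has it, else the first available one.
// `invalid` is true only when a value was requested but does not exist.
function pick(requested: string | null, available: string[]): { value: string; invalid: boolean } {
  const first = available[0];
  if (first === undefined) {
    throw new Error("dataset offers no options to choose from");
  }
  if (requested !== null && available.indexOf(requested) !== -1) {
    return { value: requested, invalid: false };
  }
  return { value: first, invalid: requested !== null };
}

function resolveView(
  query: string,
  annotations: string[],
  projections: string[]
): { view: ViewState; query: string; urlNeedsUpdate: boolean } {
  const params = new URLSearchParams(query);
  const a = pick(params.get("annotation"), annotations);
  const p = pick(params.get("projection"), projections);

  // A bare URL (no params) is left alone; only invalid values trigger a rewrite.
  const urlNeedsUpdate = a.invalid || p.invalid;
  if (urlNeedsUpdate) {
    params.set("annotation", a.value);
    params.set("projection", p.value);
  }
  return {
    view: { annotation: a.value, projection: p.value },
    query: params.toString(),
    urlNeedsUpdate,
  };
}
```

With hypothetical annotations family and species, a query of annotation=bogus would resolve to family and flag the URL for normalization, while an empty query would resolve to the same view without touching the URL.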
Automatic dataset restore requires OPFS
ProtSpace uses the Origin Private File System (OPFS) to restore your last imported dataset after a page reload.
OPFS may be unavailable in private/incognito browsing mode, when browser storage is restricted, or in older browsers that do not support it.
ProtSpace still works normally without OPFS. Your dataset loads for the current session, but you will need to import it again after reloading the page.
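If you want to check whether your browser exposes OPFS, a feature-detection sketch along these lines may help. The helper name isOpfsAvailable and the StorageLike interface are hypothetical, and this is not how ProtSpace itself necessarily performs the check. The only real API involved is the standard getDirectory() method on navigator.storage, which the sketch takes as a parameter so the logic can also run outside a browser.

```typescript
// Minimal structural type covering the one method this check needs.
// In a browser, navigator.storage satisfies it when OPFS is supported.
interface StorageLike {
  getDirectory?: () => Promise<unknown>;
}

async function isOpfsAvailable(storage: StorageLike | undefined): Promise<boolean> {
  // Older browsers: no storage manager, or no getDirectory method.
  if (!storage || typeof storage.getDirectory !== "function") {
    return false;
  }
  try {
    // Some browsers expose the method but reject the call, for example
    // in private browsing or when storage is restricted.
    await storage.getDirectory();
    return true;
  } catch {
    return false;
  }
}
```

In a browser console this could be called as isOpfsAvailable(navigator.storage). A result of false would suggest that the last imported dataset will not be restored after a reload.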
Need a Data File?
To create your own .parquetbundle files:
- Using Google Colab - No installation required (recommended)
- Using Python CLI - For local processing or automation
Or download example datasets from the GitHub data folder.