EMBEDDING API
Generate optimal embeddings for any ML task
Our embedding API creates a statistically optimal representation of your dataset for your task, allowing faster training of production-ready models on limited, sparse, and high-dimensional data — without extensive feature engineering.
Schedule a Discovery Call
Don’t let imperfect data get in the way of great models
By learning to represent complex data relationships in a purely statistical form, Dark Matter gets performant models working faster without extensive feature engineering or resource-intensive deep learning — enabling data scientists to spend less time on data and more time solving hard problems.
Case study: how we improved models trained on a limited, sparse, and high-dimensional dataset
To demonstrate the capabilities of Dark Matter, we sought out a difficult task: many sparse features, a hard-to-discern signal-to-noise ratio, a small number of examples, and a domain in which we had no prior expertise. Against baseline performance, Dark Matter improved model metrics across the board.
Enhance your pipeline, no matter your model or domain.
Slots in Seamlessly
Surprisingly lightweight, Dark Matter adds a transformative step to the data science pipeline without altering your existing processes.
Domain and Model Agnostic
We make any model in any domain better simply by creating richer representations of the relationships in your existing data.
Secure Integration
Integration is available on-premises or via cloud API. Retain total control of your pipeline, keeping the privacy and integrity of your data intact.
Optimize training and inference across tasks
Forecasting
Price predictions
Supply and demand
Customer churn
Recommendations
Ad placement
Content suggestions
Product personalization
Specialized Tasks
Chemical discovery
Sensor data
Virus-host interactions
Seamlessly slots into your existing ML pipeline.
Dark Matter is a surprisingly lightweight solution that integrates seamlessly into your pipeline — either on-prem or via cloud API. So you maintain end-to-end control of your ML process, data, and models.
Securely installs in under 5 minutes
# Import
import ensemblecore as ec
# Authentication
user = ec.User()
user.login(username='USERNAME', password='PASSWORD', token='TOKEN')
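For illustration only, a minimal sketch of what a first call might look like after authentication is shown below. The Embedder class, its fit_transform method, and the file names are assumptions made for this example, not the documented ensemblecore API; see the API docs for actual usage.
# Illustrative sketch only: Embedder and fit_transform are assumed names,
# not the documented ensemblecore API.
import ensemblecore as ec
import pandas as pd

# Authenticate as in the snippet above
user = ec.User()
user.login(username='USERNAME', password='PASSWORD', token='TOKEN')

# Load the tabular features and labels you already train on today
X = pd.read_csv('train_features.csv')
y = pd.read_csv('train_labels.csv')['target']

# Hypothetical call: generate an embedded representation of the dataset,
# then train your existing model on the richer features unchanged
embedder = ec.Embedder(user=user)          # assumed constructor
X_embedded = embedder.fit_transform(X, y)  # assumed method name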
Backed by:
Mark Nelson
Former CEO of Tableau
Research
Feature Enhancement: A New Approach for Representation Learning (Whitepaper)
Discover a novel approach to representing complex, non-linear relationships inherent in real-world data.
Feature Programming for Multivariate Time Series Prediction (ICML)
Learn about a new framework for automated feature engineering from noisy time series data.
Resources
Blog
Op-eds and thoughts on the state of machine learning and AI
Documentation
Developer support, API docs, quick-start guide
Published Research
Ensemble research, papers, and conferences
Frequently Asked Questions
Dark Matter is available for on-premises installation on your machine and using your compute resources, enabling you to retain complete control over your proprietary data. On-prem deployment ensures that we never see your data.
We encourage you to try Dark Matter with your data and model to compare the results with your existing pipeline. Most customers use it in a testing environment with sample data to minimize resource requirements before putting it into production. If you’d like to set up a trial, please fill out the form here and we will be in touch.
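For a concrete sense of what such a side-by-side trial could look like, here is a sketch using scikit-learn. It assumes the hypothetical X, y, and X_embedded variables from the installation example above and is not part of the Dark Matter API.
# Sketch of a side-by-side trial: score the same model on your original
# features and on the embedding-enhanced features (X_embedded is the
# hypothetical output from the installation example above).
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

model = GradientBoostingClassifier(random_state=0)

baseline_auc = cross_val_score(model, X, y, cv=5, scoring='roc_auc').mean()
embedded_auc = cross_val_score(model, X_embedded, y, cv=5, scoring='roc_auc').mean()

print(f'Baseline AUC: {baseline_auc:.3f}')
print(f'Embedded AUC: {embedded_auc:.3f}')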
While Dark Matter does create new variables, its mechanics are fundamentally different from those of synthetic data generation. Traditional synthetic data recreates existing distributions from Gaussian noise, so no new information is created. This has the virtue of anonymizing data (which is essential for some regulated industries), but it has minimal impact on predictive accuracy because it merely mirrors the statistical properties of your data.
In contrast, Dark Matter learns how to create embeddings that have different statistical properties and distributions. Using our new machine learning algorithm, it’s able to converge on nearly orthogonal features that measurably improve predictive accuracy.
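To illustrate what "nearly orthogonal" means in practice, the short check below inspects the pairwise correlations between embedding columns. It uses NumPy and the hypothetical X_embedded from the earlier example; it is an illustrative check, not part of the Dark Matter API.
# Illustrative check of near-orthogonality: off-diagonal entries of the
# feature-by-feature correlation matrix should be close to zero.
import numpy as np

emb = np.asarray(X_embedded)                         # hypothetical embedding matrix (rows = samples)
corr = np.corrcoef(emb, rowvar=False)                # correlations between embedding columns
off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]  # drop the diagonal of ones

print(f'Max |correlation| between embedding features: {np.abs(off_diag).max():.3f}')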
One of the primary benefits of Dark Matter is that it lowers the barrier to useful predictive performance by creating richer representations of your data. That said, there is a minimum threshold of data quality and volume below which it won't help (i.e. if what you're working with is mostly noise, Dark Matter probably can't recover a useful signal). Our rule of thumb is that if you have a working data science pipeline that's generating mediocre predictions, Dark Matter can improve its performance.