Radically smaller models without compromise.
Our Model Shrinking Platform lets you cut training & inference costs without sacrificing performance. Upload any custom or open-source model and let us do the rest.
Introducing Dark Matter, a new step in the data science pipeline
Dark Matter uses a new objective function that takes snapshots of the loss landscape and learns how to create statistically optimal features from those snapshots. These features are encoded into an embedding that adds new information to your model, effectively distilling signal from noise to make relationships easier for the model to learn.
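Dark Matter's objective function is proprietary, so the snippet below is only a rough sketch of the general pattern described here: learn an embedding from the training data, append it to the original feature matrix, and train as usual. A generic learned transform (PCA) stands in for the real algorithm, and every name in it is illustrative.

# Illustrative sketch only -- PCA stands in for Dark Matter's
# proprietary embedding; the pattern shown is "learn features,
# append them to the raw inputs, train as usual".
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

embedder = PCA(n_components=5).fit(X_train)  # learn on the training split only
X_train_aug = np.hstack([X_train, embedder.transform(X_train)])
X_test_aug = np.hstack([X_test, embedder.transform(X_test)])

model = LogisticRegression(max_iter=1000).fit(X_train_aug, y_train)
print(f"accuracy: {model.score(X_test_aug, y_test):.3f}")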
Not only does this improve model metrics like accuracy, precision, and F1 score across the board, it also means less time spent on data and feature engineering, freeing up your team to focus on experimentation and applications.
Check out our Research and Documentation for more technical details.
Large Language Models don't have to be Large (or expensive)
Shrink the parameter count of any off-the-shelf model from Hugging Face, or of your own custom models, without giving up the accuracy you count on.
Cost Efficient
2x smaller models ⇒ less money spent on training, fine-tuning, & inference.
Lower Latency
Smaller model ⇒ faster inference ⇒ superior customer experience.
Fully Multimodal
Compatible with any unimodal or multimodal model.
Highly Accurate
Maintain model performance across benchmarks every time.
Securely installs in under 5 minutes
Dark Matter is a surprisingly lightweight solution that slots seamlessly into your data science pipeline with just a few lines of code. Install it on-prem for total privacy or run it via our cloud API for rapid scalability.
# Import
import ensemblecore as ec
# Authentication
user = ec.User()
user.login(username='USERNAME', password='PASSWORD', token='TOKEN')
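What you call after authenticating depends on your pipeline. As a rough sketch of the augment-and-train flow (the DarkMatter class and its fit/transform methods below are hypothetical placeholders, not the documented ensemblecore API; see the Documentation for the real calls):

# Hypothetical usage sketch -- class and method names are placeholders,
# not the documented ensemblecore API.
dm = ec.DarkMatter(user=user)        # hypothetical handle for the service
dm.fit(X_train, y_train)             # learn the embedding from your data
X_train_aug = dm.transform(X_train)  # original features + engineered ones
X_test_aug = dm.transform(X_test)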
Improving predictive power across applications
Dark Matter boosts the productivity of any ML pipeline, reducing training compute and preserving valuable resources. Works regardless of industry, model type, or prediction task — even with limited data.
Example Applications

Forecasting
Price predictions
Supply and demand
Customer churn

Recommendations
Ad placement
Content suggestions
Product personalization

Optimized Training
Reduces compute
Train on limited / sparse data
Frequently Asked Questions
Is my data secure?
Dark Matter is available for on-premises installation, running on your own machines and using your own compute resources, so you retain complete control over your proprietary data. On-prem deployment ensures that we never see your data.
How can I try Dark Matter?
We encourage you to try Dark Matter with your own data and model and compare the results with your existing pipeline. Most customers use it in a testing environment with sample data to minimize resource requirements before putting it into production. If you’d like to set up a trial, please fill out the form here and we will be in touch.
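A side-by-side comparison can be as simple as training the same model twice, once on the raw features and once on the augmented ones. This sketch reuses the hypothetical dm object and the train/test splits from the snippets above; every name in it is illustrative.

# Illustrative A/B comparison -- dm is the hypothetical Dark Matter
# handle from the usage sketch above, not a documented API.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score

baseline = GradientBoostingClassifier().fit(X_train, y_train)
augmented = GradientBoostingClassifier().fit(dm.transform(X_train), y_train)

print("baseline F1: ", f1_score(y_test, baseline.predict(X_test)))
print("augmented F1:", f1_score(y_test, augmented.predict(dm.transform(X_test))))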
How is Dark Matter different from synthetic data?
While Dark Matter does create new variables, its mechanics are fundamentally different. Traditional synthetic data recreates existing distributions from Gaussian noise, so no new information is created. This has the virtue of anonymizing data (which is essential in some regulated industries), but because it mirrors the statistical properties of your data, it has minimal impact on predictive accuracy.
In contrast, Dark Matter learns how to create embeddings with different statistical properties and distributions. Using our new machine learning algorithm, it converges on nearly orthogonal features that measurably improve predictive accuracy.
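If you want to verify the "nearly orthogonal" claim on your own outputs, one simple check is the cross-correlation between the original columns and the engineered ones. A small sketch (X_orig and X_new are placeholder names for your raw features and the new embedding columns):

# Sketch: check how correlated the engineered features are with the
# originals; values near 0 mean the new columns are nearly orthogonal.
import numpy as np

def max_cross_correlation(X_orig, X_new):
    k = X_orig.shape[1]
    corr = np.corrcoef(X_orig, X_new, rowvar=False)  # joint (k + m) x (k + m) matrix
    cross = corr[:k, k:]                             # original-vs-new block
    return float(np.abs(cross).max())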
How much data do I need?
One of the primary benefits of Dark Matter is that it lowers the barrier to useful predictive performance by creating richer representations of your data. That said, there is a minimum threshold of data quality and volume below which it can’t help (i.e. if what you’re working with is mostly noise, it probably won’t improve). Our rule of thumb: if you have a working data science pipeline that’s generating mediocre predictions, Dark Matter can improve its performance.
Research
Feature Enhancement: A New Approach for Representation Learning (Whitepaper)
Discover a novel approach to representing complex, non-linear relationships inherent in real-world data.