An open source machine learning platform for scientists
Accessing ML-ready datasets just became the easiest part of your workflow
We make collaboration and reuse easy
from foundry import Foundry
f = Foundry()  # create a Foundry client
dataset = f.get_dataset("foundry_stan_segmentation_v1.1")
dataset.get_as_dict()
and save you valuable time
No need to spend time making sense of someone else's data. We collect information about each dataset to take out any guesswork on your part. Our schema requires key information for every single dataset we host, making it easy to interpret each one. Our publishing process also includes a human review to make sure each dataset meets our data standards.
Key | Type | Units | Description |
---|---|---|---|
reference | input | | source publication of the band gap value |
structure | input | | the structure of this compound |
bandgap value (eV) | target | eV | value of the band gap |
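
As a rough illustration of why the Type column is useful, the key table above can be used to separate model inputs from prediction targets. The sketch below is plain Python and does not call Foundry: `schema_keys` is a hand-copied, hypothetical representation of the table, and `keys_of_type` is a helper name made up for this example.

```python
# Hypothetical in-memory copy of the key table above (not fetched from Foundry).
schema_keys = [
    {"key": "reference", "type": "input", "units": None,
     "description": "source publication of the band gap value"},
    {"key": "structure", "type": "input", "units": None,
     "description": "the structure of this compound"},
    {"key": "bandgap value (eV)", "type": "target", "units": "eV",
     "description": "value of the band gap"},
]


def keys_of_type(keys, key_type):
    """Return the names of all keys whose type matches key_type."""
    return [entry["key"] for entry in keys if entry["type"] == key_type]


# Split the keys into model inputs and prediction targets.
input_keys = keys_of_type(schema_keys, "input")
target_keys = keys_of_type(schema_keys, "target")

print("inputs:", input_keys)
print("targets:", target_keys)
```

Because every dataset describes its keys this way, the same split could be written once and reused across datasets without reading each one's source paper first.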
Created for sharing large datasets
Our infrastructure was built with large datasets in mind. Share data that common infrastructures (email, Dropbox, GitHub) can't handle. We use Globus to easily transfer data to anywhere you want to use it - from a laptop to a supercomputer.
Foundry-ML is part of the Materials Data Facility. MDF collects high-quality scientific datasets from the community and makes them easy to find and use. Foundry-ML datasets are a subset of MDF datasets that are structured and can be accessed programmatically. For datasets that have less structure, or don't need to be accessed programmatically, consider using MDF directly instead of Foundry-ML.
Build Foundry With Us
At Foundry, we believe accessibility and collaboration are the keys to problem solving. That's why we've made it super easy for you to share and use data. If you have a dataset you'd like to contribute, our publishing guide can set you up to do that.
Check out our documentation to see how easy it is to use Foundry, and follow our examples to do it all yourself.