Simulation

Abstract

This tutorial guides you through an example simulation workflow using the CRIPT Python SDK.

Installation

Before you start, make sure the cript Python package is installed.

pip install cript

Connect to CRIPT

To connect to CRIPT, you must provide a host and an API token. For most users, the host will be https://api.criptapp.org/. The examples below also pass a storage token.

Keep API Token Secure

To keep your account secure, avoid storing sensitive information such as tokens directly in your code; use environment variables instead. Storing tokens in code shared on platforms like GitHub can lead to security incidents, because anyone who possesses your token can impersonate you on the CRIPT platform. Consider alternative methods for loading tokens with the CRIPT API Client. If your token is exposed, immediately generate a new token to revoke access for the old one, and keep the new token safe.
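One way to follow this advice is to read the tokens from environment variables at startup. The sketch below uses only the standard library. The variable names `CRIPT_TOKEN` and `CRIPT_STORAGE_TOKEN` are assumptions chosen for illustration, so substitute whatever names you export in your shell.

```python
import os


def load_cript_tokens(env=os.environ):
    """Return (api_token, storage_token) read from environment variables.

    The variable names are illustrative assumptions, not names mandated by CRIPT.
    """
    names = ("CRIPT_TOKEN", "CRIPT_STORAGE_TOKEN")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["CRIPT_TOKEN"], env["CRIPT_STORAGE_TOKEN"]


# Demo with a stand-in mapping; in real use, call load_cript_tokens() with no arguments.
api_token, storage_token = load_cript_tokens(
    {"CRIPT_TOKEN": "123456", "CRIPT_STORAGE_TOKEN": "987654"}
)
```

The returned values can then be passed as `api_token` and `storage_token` to `cript.API` in place of string literals.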

import cript

with cript.API(host="https://api.criptapp.org/", api_token="123456", storage_token="987654") as api:
    pass

Note

You may notice that we are not executing any code inside the context manager block. If you were writing a Python script rather than a Jupyter notebook, you would put all of the following code inside that block. Here, in a Jupyter notebook, we need to connect manually and remember to disconnect at the end.

api = cript.API(
    host="https://api.criptapp.org/", api_token=None, storage_token="123456"
)
api = api.connect()
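The difference between the two connection styles can be illustrated with a stand-in class. `DemoConnection` below is hypothetical and is not part of the CRIPT SDK. It only mimics a connect/disconnect life cycle to show that a `with` block disconnects automatically, whereas a manual connection must be closed explicitly, ideally in a `try`/`finally` block.

```python
class DemoConnection:
    """Hypothetical stand-in that mimics a connect/disconnect life cycle."""

    def __init__(self):
        self.connected = False

    def connect(self):
        self.connected = True
        return self

    def disconnect(self):
        self.connected = False

    def __enter__(self):
        return self.connect()

    def __exit__(self, exc_type, exc_value, traceback):
        self.disconnect()
        return False  # do not swallow exceptions


# Script style: the with-block disconnects on exit, even if an error occurs.
with DemoConnection() as scripted:
    inside_block = scripted.connected

# Notebook style: connect manually and make sure disconnect always runs.
manual = DemoConnection().connect()
try:
    during_session = manual.connected
finally:
    manual.disconnect()
```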

Create a Project

All data uploaded to CRIPT must be associated with a Project node. A Project can be thought of as an overarching research goal, for example, finding a replacement for an existing material from a sustainable feedstock.

# create a new project in the CRIPT database
project = cript.Project(name="My simulation project.")

Create a Collection node

Within this project, you can create multiple collections, each of which represents a set of experiments. For example, you can create a collection for a specific manuscript, or one collection for the initial screening of candidates and another for later refinements.

So, let's create a collection node and add it to the project.

collection = cript.Collection(name="Initial simulation screening")
# We add this collection to the project as a list.
project.collection += [collection]

Viewing CRIPT JSON

If you are interested in the inner workings of CRIPT, you can obtain a JSON representation of your data at any time to see what is sent to the API through HTTP JSON requests.

print(project.json)

Format JSON in terminal

Format the JSON within the terminal for easier reading.

print(project.get_json(indent=2).json)
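If you already have a compact JSON string, for example one copied from a log, the standard `json` module can re-indent it. The string below is a made-up stand-in, not real CRIPT output.

```python
import json

# Made-up compact JSON standing in for a serialized node.
compact = '{"node": ["Project"], "name": "My simulation project."}'

# Parse and re-serialize with indentation and sorted keys for easier reading.
pretty = json.dumps(json.loads(compact), indent=2, sort_keys=True)
print(pretty)
```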

Create an Experiment node

The Collection node holds a series of Experiment nodes.

Let's create an experiment and add it to the collection of the project.

experiment = cript.Experiment(name="Simulation for the first candidate")
collection.experiment += [experiment]

Create relevant Software nodes

Software nodes refer to the software that you use during your simulation experiment. In general, Software nodes can be shared between projects, and doing so is encouraged: if the software you are using is already present in CRIPT, use it.

If it is not, you can create Software nodes as follows:

python = cript.Software(name="python", version="3.9")

rdkit = cript.Software(name="rdkit", version="2020.9")

stage = cript.Software(
    name="stage", source="https://doi.org/10.1021/jp505332p", version="N/A"
)

packmol = cript.Software(
    name="Packmol", source="http://m3g.iqm.unicamp.br/packmol", version="N/A"
)

openmm = cript.Software(name="openmm", version="7.5")

In general, provide as much information about the software as possible; this helps make your results reproducible. Even if a piece of software is not publicly available, such as in-house code, we encourage you to specify it in CRIPT. If a version number is not available, consider using a git hash.
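For in-house code without release numbers, a commit hash can serve as the version string. The helper below is a sketch that shells out to `git rev-parse HEAD`. The function name `current_git_hash` is our own, and it returns `None` when git is unavailable or the directory is not a repository.

```python
import subprocess


def current_git_hash(repo_dir="."):
    """Return the full commit hash of HEAD in repo_dir, or None if unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is missing, the directory does not exist, or it is not a repository.
        return None
    return result.stdout.strip()


# Fall back to a placeholder when no hash can be determined.
version = current_git_hash() or "N/A"
```

The resulting string could then be passed as the `version` argument of `cript.Software`.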

Create Software Configuration

Now that we have our Software nodes, we can create SoftwareConfiguration nodes. SoftwareConfiguration nodes let you specify which algorithms from the software package you are using and log the parameters for these algorithms.

The SoftwareConfiguration nodes are then used to construct our Computation nodes, which describe the actual computations you are performing.

We can also attach Algorithm nodes to a SoftwareConfiguration node, and Algorithm nodes may in turn contain nested Parameter nodes. The example below attaches a single Algorithm node, without parameters, to the OpenMM configuration.

# create some software configuration nodes
python_config = cript.SoftwareConfiguration(software=python)
rdkit_config = cript.SoftwareConfiguration(software=rdkit)
stage_config = cript.SoftwareConfiguration(software=stage)

# create a software configuration node with a child Algorithm node
openmm_config = cript.SoftwareConfiguration(
    software=openmm,
    algorithm=[
        cript.Algorithm(
            key="energy_minimization",
            type="initialization",
        ),
    ],
)
packmol_config = cript.SoftwareConfiguration(software=packmol)

Algorithm keys

The allowed Algorithm keys are listed under algorithm keys in the CRIPT controlled vocabulary.

Parameter keys

The allowed Parameter keys are listed under parameter keys in the CRIPT controlled vocabulary.

Create Computations

Now that we've created some SoftwareConfiguration nodes, we can use them to build full Computation nodes. In some cases, we may also want to add Condition nodes to our computation to specify the conditions under which the computation was carried out. An example of this is shown below.

# Create a Computation node.
# This computation covers the generation of the force field.
# It also covers the initial placement of molecules within a simulation box.
init = cript.Computation(
    name="Initial snapshot and force-field generation",
    type="initialization",
    software_configuration=[
        python_config,
        rdkit_config,
        stage_config,
        packmol_config,
        openmm_config,
    ],
)

# Initiate the simulation equilibration using a separate node.
# The equilibration process is governed by specific conditions and a 
# set equilibration time.
# Given this is an NPT (Number of particles, Pressure, Temperature) 
# simulation, conditions such as the number of chains, temperature, 
# and pressure are specified.
equilibration = cript.Computation(
    name="Equilibrate data prior to measurement",
    type="MD",
    software_configuration=[python_config, openmm_config],
    condition=[
        cript.Condition(key="time_duration", type="value", value=100.0, unit="ns"),
        cript.Condition(key="temperature", type="value", value=450.0, unit="K"),
        cript.Condition(key="pressure", type="value", value=1.0, unit="bar"),
        cript.Condition(key="number", type="value", value=31),
    ],
    prerequisite_computation=init,
)

# This section involves the actual data measurement.
# Note that we use the previously computed data as a prerequisite. 
# Additionally, we incorporate the input data at a later stage.
bulk = cript.Computation(
    name="Bulk simulation for measurement",
    type="MD",
    software_configuration=[python_config, openmm_config],
    condition=[
        cript.Condition(key="time_duration", type="value", value=50.0, unit="ns"),
        cript.Condition(key="temperature", type="value", value=450.0, unit="K"),
        cript.Condition(key="pressure", type="value", value=1.0, unit="bar"),
        cript.Condition(key="number", type="value", value=31),
    ],
    prerequisite_computation=equilibration,
)

# The following step involves analyzing the data 
# from the measurement run to ascertain a specific property.
ana = cript.Computation(
    name="Density analysis",
    type="analysis",
    software_configuration=[python_config],
    prerequisite_computation=bulk,
)

# Add all these computations to the experiment.
experiment.computation += [init, equilibration, bulk, ana]
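The `prerequisite_computation` links above form a chain from the analysis back to the initialization. To make that structure concrete, the sketch below rebuilds the chain with a plain dataclass and walks it in execution order. `Step` and `execution_order` are hypothetical stand-ins for illustration, not SDK classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    """Hypothetical stand-in for a computation with one optional prerequisite."""

    name: str
    prerequisite: Optional["Step"] = None


def execution_order(last_step):
    """Follow prerequisite links backwards, then return names first-to-last."""
    names = []
    step = last_step
    while step is not None:
        names.append(step.name)
        step = step.prerequisite
    return list(reversed(names))


# Mirror the tutorial's chain: init -> equilibration -> bulk -> ana.
first = Step("init")
second = Step("equilibration", prerequisite=first)
third = Step("bulk", prerequisite=second)
fourth = Step("ana", prerequisite=third)

order = execution_order(fourth)
print(order)  # ['init', 'equilibration', 'bulk', 'ana']
```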

Computation types

The allowed Computation types are listed under computation types in the CRIPT controlled vocabulary.

Condition keys

The allowed Condition keys are listed under condition keys in the CRIPT controlled vocabulary.
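Because keys must come from the controlled vocabulary, a local check can catch typos before anything is sent to the API. The sketch below is an assumption-laden illustration: `KNOWN_CONDITION_KEYS` contains only the condition keys used in this tutorial, not the full vocabulary, and `unknown_keys` is a helper name of our own.

```python
# Illustrative subset only: the condition keys used in this tutorial,
# not the full CRIPT controlled vocabulary.
KNOWN_CONDITION_KEYS = {"time_duration", "temperature", "pressure", "number"}


def unknown_keys(keys, allowed=KNOWN_CONDITION_KEYS):
    """Return the sorted keys that are not in the allowed set."""
    return sorted(set(keys) - set(allowed))


print(unknown_keys(["temperature", "presure"]))  # ['presure']
```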

Create and Upload Files nodes

Now we'd like to upload files associated with our simulation. First, we'll instantiate our File nodes under a specific project.

packing_file = cript.File(
    name="Initial simulation box snapshot with roughly packed molecules",
    type="computation_snapshot",
    source="path/to/local/file",
    extension=".csv",
)

forcefield_file = cript.File(
    name="Forcefield definition file",
    type="data",
    source="path/to/local/file",
    extension=".pdf",
)

snap_file = cript.File(
    name="Bulk measurement initial system snapshot",
    type="computation_snapshot",
    source="path/to/local/file",
    extension=".png",
)

final_file = cript.File(
    name="Final snapshot of the system at the end of the simulation",
    type="computation_snapshot",
    source="path/to/local/file",
    extension=".jpeg",
)

Note

The source field should point to any file on your local filesystem or a web URL to where the file can be found.

For example, the CRIPT protein JSON file on CRIPTScripts.

Note that we haven't uploaded the files to CRIPT yet. The upload is performed automatically when the project is saved via api.save(project).
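Because `source` may be either a local path or a web URL, it can help to check which one you have before creating a File node. The helper names below are our own, and the sketch relies only on `urllib.parse` and `pathlib`.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_web_url(source):
    """Return True if source looks like an http(s) URL rather than a local path."""
    return urlparse(source).scheme in ("http", "https")


def describe_source(source):
    """Classify a source string and report its file extension, if any."""
    if is_web_url(source):
        kind = "url"
        suffix = Path(urlparse(source).path).suffix
    else:
        kind = "local"
        suffix = Path(source).suffix
    return kind, suffix


print(describe_source("path/to/local/file.csv"))  # ('local', '.csv')
print(describe_source("https://example.org/data/file.json"))  # ('url', '.json')
```

The reported suffix could be compared against the `extension` you intend to pass, so that the two stay consistent.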

Create Data

Next, we'll create a Data node which helps organize our File nodes and links back to our Computation objects.

packing_data = cript.Data(
    name="Loosely packed chains",
    type="computation_config",
    file=[packing_file],
    computation=[init],
    notes="PDB file without topology describing an initial system.",
)

forcefield_data = cript.Data(
    name="OpenMM forcefield",
    type="computation_forcefield",
    file=[forcefield_file],
    computation=[init],
    notes="Full forcefield definition and topology.",
)

equilibration_snap = cript.Data(
    name="Equilibrated simulation snapshot",
    type="computation_config",
    file=[snap_file],
    computation=[equilibration],
)

final_data = cript.Data(
    name="Logged volume during simulation",
    type="computation_trajectory",
    file=[final_file],
    computation=[bulk],
)

Data types

The allowed Data types are listed under the data types in the CRIPT controlled vocabulary.

Next, we'll link these Data nodes to the appropriate Computation nodes.

# Observe how this step also forms a continuous graph, 
# enabling data to flow from one computation to the next.
# The sequence initiates with the computation process 
# and culminates with the determination of the material property.
init.output_data = [packing_data, forcefield_data]
equilibration.input_data = [packing_data, forcefield_data]
equilibration.output_data = [equilibration_snap]
ana.input_data = [final_data]
bulk.output_data = [final_data]
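A quick sanity check on such a graph is that every input of a computation is produced as the output of some computation. The sketch below mirrors the assignments above with plain dictionaries of names, which are stand-ins rather than SDK objects, and reports any input that no computation produces.

```python
# Names mirror the tutorial's variables; plain strings stand in for the nodes.
outputs = {
    "init": ["packing_data", "forcefield_data"],
    "equilibration": ["equilibration_snap"],
    "bulk": ["final_data"],
}
inputs = {
    "equilibration": ["packing_data", "forcefield_data"],
    "ana": ["final_data"],
}

# Collect everything that some computation produces.
produced = {data for items in outputs.values() for data in items}

# An input is dangling if no computation lists it as an output.
dangling = {
    computation: [data for data in items if data not in produced]
    for computation, items in inputs.items()
}
dangling = {name: items for name, items in dangling.items() if items}

print(dangling)  # {}
```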

Create a virtual Material

First, we'll create a virtual material with identifiers to make it easier to search for.

# create a material node object with identifiers
polystyrene = cript.Material(
    name="virtual polystyrene",
    bigsmiles="[H]{[>][<]C(C[>])c1ccccc1[<]}C(C)CC",
    names=["poly(styrene)", "poly(vinylbenzene)"],
    chem_repeat=["C8H8"],
)

Add `Property` sub-objects

Let's also add some Property nodes to the Material, which represent its physical or virtual (in the case of a simulated material) properties.

phase = cript.Property(key="phase", value="solid", type="none", unit=None)
color = cript.Property(key="color", value="white", type="none", unit=None)

polystyrene.property += [phase]
polystyrene.property += [color]

Material property keys

The allowed material Property keys are listed under the material property keys in the CRIPT controlled vocabulary.

Create `ComputationalForcefield`

Finally, we'll create a ComputationalForcefield node and link it to the Material.

forcefield = cript.ComputationalForcefield(
    key="opls_aa",
    building_block="atom",
    source="Custom determination via STAGE",
    data=[forcefield_data],
)

polystyrene.computational_forcefield = forcefield

Computational forcefield keys

The allowed ComputationalForcefield keys are listed under the computational forcefield keys in the CRIPT controlled vocabulary.

Now we can save the project to CRIPT (and upload the files) or inspect the JSON output.

Validate CRIPT Project Node

# Before we can save it, we should add all the orphaned nodes to the experiments.
# It is important to do this for every experiment separately, 
# but here we only have one.
cript.add_orphaned_nodes_to_project(project, active_experiment=experiment)
project.validate()

# api.save(project)
print(project.get_json(indent=2).json)

# Let's not forget to close the API connection after everything is done.
api.disconnect()

Conclusion

You made it! We hope this tutorial has been helpful.

Please let us know how you think it could be improved. Feel free to reach out to us on our CRIPT Python SDK GitHub. We'd love your input and contributions!
