An API is available for programmatic access to CrystaLLM. The API is located at:
https://api.crystallm.com/v1/generate
To use the service, an API key is required, which can be obtained by contacting us. Please note that there is a fee for using the API; see Pricing for more information. Alternatively, the model code and weights can be freely downloaded and used on personal hardware. The open source code repository contains instructions for setting up and using CrystaLLM.
Requests
To invoke CrystaLLM through the API, make a POST request, sending the API key in a header named x-api-key. Place the message payload in the body of the request.
For example, to request a structure for Ba2MnCr, with Z=3, in space group R-3m:
curl -X POST \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{"model": "small", "message": {"comp": "Ba2MnCr", "z": 3, "sg": "R-3m"}}' \
  https://api.crystallm.com/v1/generate
In the example above, the body of the request contains the following properties:
- model: the model to use; can be "small", "medium", or "large" (required)
- comp: the composition; must represent a single formula unit (required)
- z: the number of formula units in the unit cell (required)
- sg: the space group symbol (optional)
As stated above, the sg property is optional. For example, to request a structure for Ba2MnCr with Z=3, and allow the space group to be determined during generation:
curl -X POST \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{"model": "small", "message": {"comp": "Ba2MnCr", "z": 3}}' \
  https://api.crystallm.com/v1/generate
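The same request can be assembled from a script. The following is a minimal sketch in Python using only the standard library; the function name build_request and its argument names are illustrative and not part of the API. It builds the request object without sending it, and leaves sg out of the message when no space group is given.

```python
import json
import urllib.request

API_URL = "https://api.crystallm.com/v1/generate"  # endpoint given above


def build_request(api_key, model, comp, z, sg=None):
    """Build (but do not send) a POST request for the generate endpoint.

    Hypothetical helper: the name and signature are illustrative only.
    """
    message = {"comp": comp, "z": z}
    if sg is not None:  # sg is optional; omit it to let generation pick one
        message["sg"] = sg
    body = json.dumps({"model": model, "message": message}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


req = build_request("YOUR_API_KEY", "small", "Ba2MnCr", 3, sg="R-3m")
# To send it, pass req to urllib.request.urlopen with a suitable timeout
# and decode the JSON body of the response.
```

Building the request separately from sending it makes it easy to inspect the headers and body before any billable call is made.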
Model Types
There are currently 3 kinds of models available: "small", "medium", and "large". The "small" model consists of ~25 million parameters. It has a request timeout of 60 seconds, and will generate a maximum of 1,200 tokens. It has difficulty generating valid CIFs that are longer than 500 tokens. However, it is the fastest of the models.
The "medium" model consists of ~85 million parameters, and the "large" model consists of ~200 million parameters. These models have a request timeout of 120 seconds, and will generate a maximum of 2,000 tokens. These models are generally better than the "small" model, and can usually generate CIFs successfully that contain 1,000 or more tokens. However, these models are slower than the "small" model. Specify the model to be used in the model property of the payload.
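The limits above can be kept in a small lookup table on the client side. The sketch below is one possible arrangement in Python: the dictionary layout, the function name client_timeout, and the 10-second margin are assumptions made for illustration, while the timeout and token figures are the ones stated above.

```python
# Limits as stated in the text above; the dictionary layout is illustrative.
MODEL_LIMITS = {
    "small": {"timeout_s": 60, "max_tokens": 1200},
    "medium": {"timeout_s": 120, "max_tokens": 2000},
    "large": {"timeout_s": 120, "max_tokens": 2000},
}


def client_timeout(model, margin_s=10):
    """Return a client-side timeout a little longer than the request timeout.

    margin_s is an assumed safety margin, not a documented value.
    """
    if model not in MODEL_LIMITS:
        raise ValueError(f"unknown model: {model!r}")
    return MODEL_LIMITS[model]["timeout_s"] + margin_s
```

Waiting slightly longer than the documented request timeout is intended to let the client receive the API's own response rather than give up first.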
Formation Energy
The formation energy per atom may be requested together with the generated structure. The formation energy per atom is predicted with ALIGNN. To include the ALIGNN-predicted formation energy per atom in the response, add the fe property with a value of true to the request payload, at the top level alongside model and message. For example, to request a structure for NaCl, with Z=4, in space group Fm-3m, together with its predicted formation energy per atom:
curl -X POST \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{"model":"small", "message":{"comp":"NaCl", "z":4, "sg":"Fm-3m"}, "fe":true}' \
  https://api.crystallm.com/v1/generate
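As a rough Python sketch of the same payload, the helper below (with_formation_energy is a hypothetical name) copies an existing payload and sets fe next to model and message, which is where the curl example places it.

```python
import json


def with_formation_energy(payload):
    """Return a copy of the payload with the top-level "fe" flag set to true.

    Hypothetical helper; the placement of "fe" follows the curl example.
    """
    updated = dict(payload)  # shallow copy, so the caller's dict is unchanged
    updated["fe"] = True
    return updated


payload = {"model": "small", "message": {"comp": "NaCl", "z": 4, "sg": "Fm-3m"}}
body = json.dumps(with_formation_energy(payload))  # JSON text for the request body
```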
Message Options
The message property of the payload contains the information that will be used to construct the prompt that will be given to the model. The comp property contains the composition of the formula unit. The z property contains the number of formula units in the unit cell. The sg property specifies the space group. The sg property is optional, whereas the comp and z properties are always required.
Valid values of Z
Any integer value > 0 is a valid value of z. However, note that the models generally perform best when z is one of 1, 2, 3, 4, 6, or 8.
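A client could check z before sending a request. The sketch below is illustrative only: check_z and RECOMMENDED_Z are hypothetical names, and the check mirrors the rule above (any integer greater than 0 is accepted, with 1, 2, 3, 4, 6, and 8 flagged as the values the models handle best).

```python
RECOMMENDED_Z = {1, 2, 3, 4, 6, 8}  # values the text says work best


def check_z(z):
    """Raise ValueError unless z is an integer > 0.

    Returns True when z is one of the recommended values, False otherwise.
    """
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(z, bool) or not isinstance(z, int) or z <= 0:
        raise ValueError(f"z must be an integer > 0, got {z!r}")
    return z in RECOMMENDED_Z
```

A caller might log a warning when the function returns False, since such values are valid but may yield poorer generations.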
Valid values of Space group
The following values for the sg property are supported:
P6/mmm | Imma | P4_32_12 | P4_2/mnm | Fd-3m | P3m1 | P-3 | P4mm |
P4_332 | P4/nnc | P2_12_12 | Pnn2 | Pbcn | P4_2/n | Cm | R3m |
Cmce | Aea2 | P-42_1m | P-42m | P2_13 | R-3 | Fm-3 | Cmm2 |
Pn-3n | P6/mcc | P3_2 | P-3m1 | P3_212 | I23 | P-62m | P4_2nm |
Pma2 | Pmma | I-42m | P-31c | Pa-3 | Pmmn | Pmmm | P4_2/ncm |
I4/mcm | I-4m2 | P3_1 | Pcc2 | Cmcm | I222 | Fddd | P312 |
Cccm | P6_1 | F-43c | P6_322 | Pm-3 | P3_121 | P6_4 | Ia-3d |
Pm-3m | P2_1/c | C222_1 | Pc | P4/n | Pba2 | Ama2 | Pbcm |
P31m | Pcca | P222 | P-43n | Pccm | P6_422 | F23 | P42_12 |
C222 | Pnnn | P6_3cm | P4_12_12 | P6/m | Fmm2 | I4_1/a | P4/mbm |
Pmn2_1 | P4_2bc | P4_22_12 | I-43d | I4/m | P4bm | Fdd2 | P3 |
P6_122 | Pnc2 | P4_2/mcm | P4_122 | Cmc2_1 | P-6c2 | R32 | P4_1 |
P4_232 | Pnna | P422 | Pban | Cc | I4_122 | P6_3/m | P6_3mc |
I4_1/amd | P4_2 | P4/nmm | Pmna | P4/m | Fm-3m | P4/mmm | Imm2 |
P4/ncc | P-62c | Ima2 | P6_5 | P2/c | P4/nbm | Ibam | P6_522 |
P6_3/mmc | I4/mmm | Fmmm | P2/m | P-4b2 | I-4 | C2/m | P4_2/mmc |
P4 | Fd-3c | P4_3 | P2_1/m | I-43m | P-42c | F4_132 | Pm |
Pccn | P-4n2 | P4_132 | P23 | I4cm | R3c | Amm2 | Immm |
Iba2 | I4 | Fd-3 | P1 | Pbam | P4_2/nbc | Im-3 | P4_2/nnm |
Pmc2_1 | P-31m | R-3m | Ia-3 | P622 | F222 | P2 | P-1 |
Pmm2 | P-4 | Aem2 | P6_222 | P-3c1 | P4_322 | I422 | Pnma |
P6_3 | P3c1 | Pn-3 | P4nc | P-6 | P4/mcc | I2_12_12_1 | P4_2/mbc |
P31c | Ccc2 | P4_2/nmc | P6_3/mcm | C2 | Pbca | P-4c2 | I4_1cd |
P2_1 | P3_112 | P4_2mc | Pn-3m | C2/c | R3 | P-43m | I432 |
P222_1 | I-42d | I-4c2 | P6cc | P6_2 | P3_221 | P321 | Pca2_1 |
I4_1/acd | I4_132 | F432 | Pna2_1 | Ccce | Ibca | P4/mnc | I4_1md |
P2_12_12_1 | R-3c | I2_13 | P-4m2 | Pm-3n | I4mm | F-43m | Pnnm |
P-42_1c | Cmmm | P6mm | P4_2cm | P4_2/m | Im-3m | Fm-3c | I4_1 |
P4cc | Cmme |
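To validate sg on the client side, the table above can be parsed directly from its pipe-delimited form. The following Python sketch assumes the table text has been pasted into a string; parse_space_groups is a hypothetical helper, and only two rows of the table are reproduced here to keep the example short.

```python
def parse_space_groups(table_text):
    """Split a pipe-delimited table into a set of space group symbols."""
    symbols = set()
    for line in table_text.splitlines():
        for cell in line.split("|"):
            cell = cell.strip()
            if cell:  # skip the empty cell after each trailing "|"
                symbols.add(cell)
    return symbols


# Two rows copied from the table above; paste the full table in practice.
SAMPLE_ROWS = """\
P6/mmm | Imma | P4_32_12 | P4_2/mnm | Fd-3m | P3m1 | P-3 | P4mm |
Pmc2_1 | P-31m | R-3m | Ia-3 | P622 | F222 | P2 | P-1 |
"""

supported = parse_space_groups(SAMPLE_ROWS)
```

A membership test such as "R-3m" in supported then catches misspelled symbols, for example one missing the hyphen that stands for an overbar, before a request is sent.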
Responses
The response is a JSON object:
{
  "cifs": [
    {
      "input": {"comp": "Ba2Mn1Cr1", "sg": "R-3m", "z": 3},
      "generated": "data_Ba6Mn3Cr3\n_symmetry...",
      "valid": true,
      "messages": [null],
      "fe": 0.8906367421150208
    }
  ]
}
The response contains a single property, cifs, which is an array of generated CIFs. For now, the array will always contain a single item. The item contains the contents of the generated CIF in the generated property. The valid property states whether the generated CIF is valid, and the messages property contains a list of messages related to the generation (e.g. why the generation may have been considered invalid). Note that the fe property will only be present if the predicted formation energy per atom was requested.
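Reading the response in Python might look like the sketch below. It assumes the body is standard JSON with the properties described above; first_cif is a hypothetical helper, and the sample string is a shortened version of the example response with the fe property left out.

```python
import json


def first_cif(response_text):
    """Return the fields of the single item in the "cifs" array."""
    data = json.loads(response_text)
    item = data["cifs"][0]  # the array currently holds a single item
    return {
        "cif": item["generated"],
        "valid": item["valid"],
        "messages": item["messages"],
        "fe": item.get("fe"),  # None when formation energy was not requested
    }


sample = (
    '{"cifs": [{"input": {"comp": "Ba2Mn1Cr1", "sg": "R-3m", "z": 3}, '
    '"generated": "data_Ba6Mn3Cr3\\n_symmetry...", '
    '"valid": true, "messages": [null]}]}'
)
result = first_cif(sample)
```

Using item.get("fe") rather than item["fe"] avoids a KeyError for requests that did not ask for the formation energy.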
If there was an error processing the request, the API will return either a 400 or 500 HTTP response status code. Depending on the nature of the error, the body of the response will look something like:
{"error": "'model' property is missing"}
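A client can turn such a response into a readable message. The sketch below is one way to do it in Python; describe_error is a hypothetical helper, and the wording "client error" and "server error" follows the general HTTP convention for 4xx and 5xx codes rather than anything specific to this API.

```python
import json


def describe_error(status, body_text):
    """Turn a 400/500 response into a short human-readable message."""
    kind = "client error" if 400 <= status < 500 else "server error"
    try:
        detail = json.loads(body_text).get("error", body_text)
    except (ValueError, AttributeError):
        detail = body_text  # body was not the expected JSON object
    return f"{status} ({kind}): {detail}"
```

The fallback to the raw body keeps the message useful even if the response is not the JSON object shown above.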
Note that currently the API will take anywhere from several seconds to 1 or 2 minutes to respond to a single request (depending on the model type and the message options). Since the API is synchronous, clients should be prepared to wait a sufficient amount of time for a response.
Pricing *
- Small Model: $0.005 per request
- Large Model: $0.01 per request
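For budgeting, the per-request prices listed above can be multiplied out in a few lines of Python. This is a rough sketch only: estimate_cost is a hypothetical helper, it covers just the two models whose prices are listed, and it does not account for any conditions that may apply to the listed prices.

```python
from decimal import Decimal

# Prices per request in USD, as listed above.
PRICE_PER_REQUEST = {"small": Decimal("0.005"), "large": Decimal("0.01")}


def estimate_cost(model, n_requests):
    """Return the listed price for n_requests calls to the given model."""
    if model not in PRICE_PER_REQUEST:
        raise KeyError(f"no listed price for model: {model!r}")
    return PRICE_PER_REQUEST[model] * n_requests
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift.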
Obtaining an API Key
If you are interested in accessing CrystaLLM through this API, please contact us for an API key.