Tutorials
Introduction to the Molecular Attention Transformer. ¶
In this tutorial we will learn more about the Molecular Attention Transformer, or MAT. MAT is a model based on transformers, aimed towards performing molecular prediction tasks. MAT is easy to tune and performs quite well relative to other molecular prediction tasks. The weights from MAT are chemically interpretable, thus making the model quite useful.
Reference Paper: Molecular Attention Transformer, Maziarka et. al.
In this tutorial, we will explore how to train MAT, and predict hydration enthalpy values for molecules from the freesolv hydration enthalpy dataset with MAT.
Colab ¶
This tutorial and the rest in this sequence are designed to be done in Google colab. If you'd like to open this notebook in colab, you can use the following link.
!pip install --pre deepchem
Import required modules ¶
import deepchem as dc
from deepchem.models.torch_models import MATModel
from deepchem.feat import MATFeaturizer
import matplotlib.pyplot as plt
wandb: WARNING W&B installed but not logged in. Run `wandb login` or set the WANDB_API_KEY env variable. wandb: WARNING W&B installed but not logged in. Run `wandb login` or set the WANDB_API_KEY env variable.
Molecule Featurization using MATFeaturizer ¶
MATFeaturizer is the featurizer intended to be used with the Molecular Attention Transformer, or MAT in short. MATFeaturizer takes a smile string or molecule as input and returns a MATEncoding dataclass object, which contains 3 numpy arrays: Node features matrix, adjacency matrix and distance matrix.
featurizer = dc.feat.MATFeaturizer()
# Let us now take an example array of smile strings and featurize it.
smile_string = ["CCC"]
output = featurizer.featurize(smile_string)
print(type(output[0]))
print(output[0].node_features)
print(output[0].adjacency_matrix)
print(output[0].distance_matrix)
<class 'deepchem.feat.molecule_featurizers.mat_featurizer.MATEncoding'> [[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]] [[0. 0. 0. 0.] [0. 0. 0. 1.] [0. 0. 0. 1.] [0. 1. 1. 0.]] [[1.e+06 1.e+06 1.e+06 1.e+06] [1.e+06 0.e+00 2.e+00 1.e+00] [1.e+06 2.e+00 0.e+00 1.e+00] [1.e+06 1.e+00 1.e+00 0.e+00]]
Getting the Freesolv Hydration enthalpy dataset ¶
We will now acquire the Freesolv Hydration enthalpy dataset from MoleculeNet. If it already exists in the directory, the file will be used. Else, deepchem will automatically download the dataset from its AWS bucket.
tasks, dataset, transformers = dc.molnet.load_freesolv()
train_dataset, val_dataset, test_dataset = dataset
train_smiles = train_dataset.ids
val_smiles = val_dataset.ids
train_dataset
<DiskDataset X.shape: (513,), y.shape: (513, 1), w.shape: (513, 1), ids: ['CCCCNCCCC' 'CCOC=O' 'CCCCCCCCC' ... 'COC' 'CCCCCCCCBr' 'CCCc1ccc(c(c1)OC)O'], task_names: ['y']>
Training the model ¶
Now that we have acquired the dataset to be used and made the necessary imports, we will be invoking the Molecular Attention Transformer in deepchem, called MATModel, and we will be training it. We will be using default parameters for the purposes of this tutorial, however they can be changed anytime according to the user's preferences.
device = 'cpu'
model = MATModel(device = device)
losses, val_losses = [], []
%%time
max_epochs = 10
for epoch in range(max_epochs):
loss = model.fit(train_dataset, nb_epoch = 1, max_checkpoints_to_keep = 1, all_losses = losses)
metric = dc.metrics.Metric(dc.metrics.score_function.rms_score)
val_losses.append(model.evaluate(val_dataset, metrics = [metric])['rms_score']**2)
# The warnings are not relevant to this tutorial thus we can safely skip them.
/home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill( /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill( /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill( /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill( /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill( /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill( /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill( /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill( /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill( /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill(
CPU times: user 20min 48s, sys: 10.7 s, total: 20min 58s Wall time: 2min 41s
f, ax = plt.subplots()
ax.scatter(range(len(losses)), losses, label='train loss')
ax.scatter(range(len(val_losses)), val_losses, label='val loss')
plt.legend(loc='upper right');
Testing the model ¶
Optimally, MAT should be trained for a lot more epochs with a GPU. Due to computational constraints, we train this model for very few epochs in this tutorial. Let us now see how to predict the hydration enthalpy values for molecues now with MAT.
# We will be predicting the enthalpy value for the smile string we featurized earlier in the MATFeaturizer section.
model.predict_on_batch(output)
/home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:165: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). node_features = torch.tensor(data[0]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:166: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). adjacency_matrix = torch.tensor(data[1]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/mat.py:167: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(data[2]).float() /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:171: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). torch.sum(torch.tensor(adj_matrix), dim=-1).unsqueeze(2) + eps) /home/atreyamaj/Desktop/deepchem/deepchem/models/torch_models/layers.py:178: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). distance_matrix = torch.tensor(distance_matrix).squeeze().masked_fill(
array([[-0.32668447]], dtype=float32)