deepchem.models package

Subpackages

Submodules

deepchem.models.models module

Contains an abstract base class that supports different ML models.

class deepchem.models.models.Model(model_instance=None, model_dir=None, verbose=True, **kwargs)[source]

Bases: sklearn.base.BaseEstimator

Abstract base class for different ML models.

evaluate(dataset, metrics, transformers=[], per_task_metrics=False)[source]

Evaluates the performance of this model on specified dataset.

Parameters:
  • dataset (dc.data.Dataset) – Dataset object.
  • metrics (deepchem.metrics.Metric or list of deepchem.metrics.Metric) – Evaluation metric(s).
  • transformers (list) – List of deepchem.transformers.Transformer
  • per_task_metrics (bool) – If True, return per-task scores.
Returns: Maps tasks to scores under the given metric(s).
Return type: dict
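As a rough illustration of the return value, the sketch below builds a task-to-score dictionary from per-task labels and predictions. It does not call DeepChem: `accuracy` and `score_per_task` are hypothetical helper names, and accuracy simply stands in for whatever metric is passed to evaluate.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the label."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def score_per_task(tasks, y_true, y_pred, metric):
    """Return {task: score}. y_true and y_pred are lists of rows, one column per task."""
    scores = {}
    for j, task in enumerate(tasks):
        true_col = [row[j] for row in y_true]
        pred_col = [row[j] for row in y_pred]
        scores[task] = metric(true_col, pred_col)
    return scores


# Hypothetical two-task data: four samples, one column per task.
tasks = ["task_a", "task_b"]
y_true = [[1, 0], [0, 0], [1, 1], [0, 1]]
y_pred = [[1, 0], [0, 1], [1, 1], [1, 1]]
scores = score_per_task(tasks, y_true, y_pred, accuracy)
```

The resulting `scores` object has the shape described above: one key per task, one score per key.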

fit(dataset, nb_epoch=10, batch_size=50, **kwargs)[source]

Fits a model on data in a Dataset object.

fit_on_batch(X, y, w)[source]

Updates existing model with new information.
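The relationship between fit and fit_on_batch is not spelled out here. One common arrangement, assumed in the sketch below, is that fit loops over epochs and mini-batches and delegates each update to fit_on_batch. `ToyModel` and `iter_batches` are hypothetical stand-ins, and the toy fit takes raw lists rather than a Dataset object to stay self-contained.

```python
def iter_batches(X, y, w, batch_size):
    """Yield consecutive (X, y, w) slices of at most batch_size samples."""
    for start in range(0, len(X), batch_size):
        end = start + batch_size
        yield X[start:end], y[start:end], w[start:end]


class ToyModel:
    """Stand-in for a Model subclass; it only counts the updates it receives."""

    def __init__(self):
        self.num_updates = 0
        self.samples_seen = 0

    def fit_on_batch(self, X, y, w):
        # A real model would update its parameters here.
        self.num_updates += 1
        self.samples_seen += len(X)

    def fit(self, X, y, w, nb_epoch=10, batch_size=50):
        # Assumed structure: one pass over all batches per epoch.
        for _ in range(nb_epoch):
            for X_b, y_b, w_b in iter_batches(X, y, w, batch_size):
                self.fit_on_batch(X_b, y_b, w_b)


X = [[float(i)] for i in range(120)]
y = [0.0] * 120
w = [1.0] * 120
model = ToyModel()
model.fit(X, y, w, nb_epoch=2, batch_size=50)
```

With 120 samples and a batch size of 50, each epoch yields three batches (50, 50, and 20 samples), so two epochs produce six calls to fit_on_batch.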

static get_model_filename(model_dir)[source]

Given model directory, obtain filename for the model itself.

get_num_tasks()[source]

Get number of tasks.

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (boolean, optional) – If True, return the parameters for this estimator and for contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

static get_params_filename(model_dir)[source]

Given model directory, obtain filename for the model's parameters.

get_task_type()[source]

Currently models can only be classifiers or regressors.

predict(dataset, transformers=[], batch_size=None)[source]

Uses self to make predictions on provided Dataset object.

Returns: y_pred
Return type: numpy ndarray of shape (n_samples,)

predict_on_batch(X, **kwargs)[source]

Makes predictions on given batch of new data.

Parameters: X (np.ndarray) – Features

predict_proba(dataset, transformers=[], batch_size=None, n_classes=2)[source]

Predicts class probabilities for the provided Dataset object.

TODO: It is unclear whether transformers make sense here.

Returns: y_pred
Return type: numpy ndarray of shape (n_samples, n_classes*n_tasks)

predict_proba_on_batch(X)[source]

Makes predictions of class probabilities on given batch of new data.

Parameters: X (np.ndarray) – Features

reload()[source]

Reload trained model from disk.

save()[source]

Dispatcher function for saving.

Each subclass is responsible for overriding this method.
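To make the override contract concrete, here is a minimal sketch of a subclass supplying its own save and reload. It is not DeepChem code: `BaseModel`, `PickledModel`, and the filename `toy_model.pkl` are hypothetical, and pickle is only one possible serialization choice.

```python
import os
import pickle
import tempfile


class BaseModel:
    """Hypothetical base class: persistence is left to subclasses."""

    def __init__(self, model_dir):
        self.model_dir = model_dir

    def save(self):
        raise NotImplementedError("Subclasses must override save().")

    def reload(self):
        raise NotImplementedError("Subclasses must override reload().")


class PickledModel(BaseModel):
    """Hypothetical subclass that pickles its weights into model_dir."""

    def __init__(self, model_dir, weights=None):
        super().__init__(model_dir)
        self.weights = weights

    def _path(self):
        # Hypothetical filename, chosen only for this sketch.
        return os.path.join(self.model_dir, "toy_model.pkl")

    def save(self):
        with open(self._path(), "wb") as f:
            pickle.dump(self.weights, f)

    def reload(self):
        with open(self._path(), "rb") as f:
            self.weights = pickle.load(f)


model_dir = tempfile.mkdtemp()
PickledModel(model_dir, weights=[0.5, -1.0]).save()

# A fresh instance pointed at the same directory recovers the saved state.
restored = PickledModel(model_dir)
restored.reload()
```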

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns: self, the estimator instance.
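The `<component>__<parameter>` naming convention can be illustrated without scikit-learn. The helper below, `split_nested_params`, is a hypothetical name and not scikit-learn's implementation; it only shows how such keys separate into a component name and a parameter name.

```python
def split_nested_params(params):
    """Group {'comp__param': value} keys by component; plain keys go under ''."""
    grouped = {}
    for key, value in params.items():
        # partition splits at the first "__" only, so deeper nesting such as
        # "a__b__c" keeps "b__c" as the sub-key for component "a".
        component, sep, sub_key = key.partition("__")
        if sep:
            grouped.setdefault(component, {})[sub_key] = value
        else:
            grouped.setdefault("", {})[key] = value
    return grouped


grouped = split_nested_params(
    {"verbose": False, "clf__C": 10.0, "clf__kernel": "rbf"}
)
```

Here `verbose` belongs to the outer estimator, while `C` and `kernel` are routed to the component named `clf`.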

deepchem.models.multitask module

Convenience class that lets singletask models fit on multitask data.

class deepchem.models.multitask.SingletaskToMultitask(tasks, model_builder, model_dir=None, verbose=True)[source]

Bases: deepchem.models.models.Model

Convenience class to let singletask models be fit on multitask data.

Warning: This current implementation is only functional for sklearn models.
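The idea behind the wrapper can be sketched in plain Python: build one singletask model per task, fit each on its own label column, and stitch the per-task predictions back into rows. This is a simplified illustration, not the DeepChem implementation. `MeanRegressor` and `PerTaskWrapper` are hypothetical, and the `model_builder` here is assumed to take no arguments.

```python
class MeanRegressor:
    """Toy singletask model: predicts the mean of its training labels."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class PerTaskWrapper:
    """Fits one singletask model per task and concatenates their predictions."""

    def __init__(self, tasks, model_builder):
        self.tasks = list(tasks)
        # model_builder is assumed to be a zero-argument callable.
        self.models = {task: model_builder() for task in self.tasks}

    def fit(self, X, y):
        # Column j of y holds the labels for task j.
        for j, task in enumerate(self.tasks):
            self.models[task].fit(X, [row[j] for row in y])

    def predict(self, X):
        columns = [self.models[task].predict(X) for task in self.tasks]
        # Transpose per-task columns into one row per sample.
        return [list(row) for row in zip(*columns)]


X = [[0.0], [1.0], [2.0]]
y = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
wrapper = PerTaskWrapper(["t1", "t2"], MeanRegressor)
wrapper.fit(X, y)
preds = wrapper.predict(X)
```

Each row of `preds` holds one prediction per task, in the order the tasks were given.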

evaluate(dataset, metrics, transformers=[], per_task_metrics=False)

Evaluates the performance of this model on specified dataset.

Parameters:
  • dataset (dc.data.Dataset) – Dataset object.
  • metrics (deepchem.metrics.Metric or list of deepchem.metrics.Metric) – Evaluation metric(s).
  • transformers (list) – List of deepchem.transformers.Transformer
  • per_task_metrics (bool) – If True, return per-task scores.
Returns: Maps tasks to scores under the given metric(s).
Return type: dict

fit(dataset, **kwargs)[source]

Updates all singletask models with new information.

Warning: This current implementation is only functional for sklearn models.

fit_on_batch(X, y, w)

Updates existing model with new information.

get_model_filename(model_dir)

Given model directory, obtain filename for the model itself.

get_num_tasks()

Get number of tasks.

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (boolean, optional) – If True, return the parameters for this estimator and for contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

get_params_filename(model_dir)

Given model directory, obtain filename for the model's parameters.

get_task_type()

Currently models can only be classifiers or regressors.

predict(dataset, transformers=[])[source]

Prediction for multitask models.

predict_on_batch(X)[source]

Concatenates results from all singletask models.

predict_proba(dataset, transformers=[], n_classes=2)[source]

Concatenates results from all singletask models.

predict_proba_on_batch(X, n_classes=2)[source]

Concatenates results from all singletask models.

reload()[source]

Load all models.

save()[source]

Save all models.

TODO(rbharath): Saving is not yet supported for this model.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns: self, the estimator instance.

Module contents

Gathers all models in one place for convenient imports.