Package: sl3 1.4.5

Jeremy Coyle

sl3: Pipelines for Machine Learning and Super Learning

A modern implementation of the Super Learner prediction algorithm, coupled with a general purpose framework for composing arbitrary pipelines for machine learning tasks.

Authors: Jeremy Coyle [aut, cre, cph], Nima Hejazi [aut], Oleg Sofrygin [aut], Ivana Malenica [aut], Rachael Phillips [aut], Weixin Cai [ctb], Yulun Wu [ctb], Hugh Jiang [ctb]

sl3_1.4.5.tar.gz
sl3_1.4.5.zip (r-4.5), sl3_1.4.5.zip (r-4.4), sl3_1.4.5.zip (r-4.3)
sl3_1.4.5.tgz (r-4.4-any), sl3_1.4.5.tgz (r-4.3-any)
sl3_1.4.5.tar.gz (r-4.5-noble), sl3_1.4.5.tar.gz (r-4.4-noble)
sl3_1.4.5.tgz (r-4.4-emscripten), sl3_1.4.5.tgz (r-4.3-emscripten)
sl3.pdf | sl3.html
sl3/json (API)
NEWS

# Install 'sl3' in R:
install.packages('sl3', repos = c('https://ictml-project.r-universe.dev', 'https://cloud.r-project.org'))
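
Once the package is installed, a first session might look like the sketch below. It is an illustration rather than an excerpt from the package documentation: it uses the bundled cpp_imputed dataset and the exported make_sl3_Task, make_learner, Lrnr_mean, Lrnr_glm and Stack, and it assumes that cpp_imputed contains the covariate columns named in covars and the outcome column haz.

```r
library(sl3)

# Bundled example data (subset of the collaborative perinatal project).
data(cpp_imputed)

# Assumed column names: adjust if your copy of the data differs.
covars <- c("apgar1", "apgar5", "parity", "gagebrth", "mage", "meducyrs", "sexn")

# A task bundles the data with the roles of its columns.
task <- make_sl3_Task(data = cpp_imputed, covariates = covars, outcome = "haz")

# Two simple learners: an intercept-only model and a GLM.
lrnr_mean <- make_learner(Lrnr_mean)
lrnr_glm <- make_learner(Lrnr_glm)

# A Stack trains several learners on the same task.
stack <- make_learner(Stack, lrnr_mean, lrnr_glm)
stack_fit <- stack$train(task)

# Predictions on the training task: one column per learner.
preds <- stack_fit$predict()
head(preds)
```

Both learners here rely only on base R modelling code; many of the other Lrnr_* classes wrap optional packages that have to be installed separately.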

Bug tracker: https://github.com/tlverse/sl3/issues

Datasets:
  • bsds - Bicycle sharing time series dataset
  • cpp - Subset of growth data from the collaborative perinatal project
  • cpp_1yr - Subset of growth data from the collaborative perinatal project
  • cpp_imputed - Subset of growth data from the collaborative perinatal project
  • density_dat - Simulated data with continuous exposure

Topics: data-science, ensemble-learning, ensemble-model, machine-learning, model-selection, regression, stacking, statistics

8.32 score, 100 stars, 748 scripts, 130 exports, 121 dependencies

Last updated 21 days ago from: c87f76ca1e. Checks: OK: 1, ERROR: 6. Indexed: no.

Target | Result | Date
Doc / Vignettes | OK | Oct 27 2024
R-4.5-win | ERROR | Oct 27 2024
R-4.5-linux | ERROR | Oct 27 2024
R-4.4-win | ERROR | Oct 27 2024
R-4.4-mac | ERROR | Oct 27 2024
R-4.3-win | ERROR | Oct 27 2024
R-4.3-mac | ERROR | Oct 27 2024

Exports: args_to_list, Custom_chain, custom_ROCR_risk, customize_chain, cv_sl, debug_predict, debug_train, debugonce_predict, debugonce_train, define_h2o_X, delayed_learner_fit_chain, delayed_learner_fit_predict, delayed_learner_process_formula, delayed_learner_subset_covariates, delayed_learner_train, delayed_make_learner, dt_expand_factors, factor_to_indicators, importance, importance_plot, inverse_sample, learner_fit_chain, learner_fit_predict, learner_process_formula, learner_subset_covariates, learner_train, loss_loglik_binomial, loss_loglik_multinomial, loss_loglik_true_cat, loss_squared_error, loss_squared_error_multivariate, Lrnr_arima, Lrnr_bartMachine, Lrnr_base, Lrnr_bayesglm, Lrnr_bound, Lrnr_caret, Lrnr_cv, Lrnr_cv_selector, Lrnr_dbarts, Lrnr_define_interactions, Lrnr_density_discretize, Lrnr_density_hse, Lrnr_density_semiparametric, Lrnr_earth, Lrnr_expSmooth, Lrnr_ga, Lrnr_gam, Lrnr_gbm, Lrnr_glm, Lrnr_glm_fast, Lrnr_glm_semiparametric, Lrnr_glmnet, Lrnr_glmtree, Lrnr_grf, Lrnr_grfcate, Lrnr_gru_keras, Lrnr_gts, Lrnr_h2o_classifier, Lrnr_h2o_glm, Lrnr_h2o_grid, Lrnr_h2o_mutator, Lrnr_hal9001, Lrnr_haldensify, Lrnr_HarmonicReg, Lrnr_hts, Lrnr_independent_binomial, Lrnr_lightgbm, Lrnr_lstm_keras, Lrnr_mean, Lrnr_multiple_ts, Lrnr_multivariate, Lrnr_nnet, Lrnr_nnls, Lrnr_optim, Lrnr_pca, Lrnr_pkg_SuperLearner, Lrnr_pkg_SuperLearner_method, Lrnr_pkg_SuperLearner_screener, Lrnr_polspline, Lrnr_pooled_hazards, Lrnr_randomForest, Lrnr_ranger, Lrnr_revere_task, Lrnr_rpart, Lrnr_rugarch, Lrnr_screener_augment, Lrnr_screener_coefs, Lrnr_screener_correlation, Lrnr_screener_importance, Lrnr_sl, Lrnr_solnp, Lrnr_solnp_density, Lrnr_stratified, Lrnr_subset_covariates, Lrnr_svm, Lrnr_ts_weights, Lrnr_tsDyn, Lrnr_xgboost, make_learner, make_learner_stack, make_sl3_Task, metalearner_linear, metalearner_linear_multinomial, metalearner_linear_multivariate, metalearner_logistic_binomial, pack_predictions, Pipeline, pooled_hazard_task, predict_classes, prediction_plot, process_data, risk, safe_dim, Shared_Data, sl3_debug_mode, sl3_list_learners, sl3_list_properties, sl3_revere_Task, sl3_Task, sl3Options, Stack, subset_folds, train_task, undebug_learner, unpack_predictions, validation_task, variable_type, Variable_Type, write_learner_template

Dependencies: abind, assertthat, backports, base64enc, BBmisc, bitops, bslib, cachem, caret, caTools, checkmate, class, cli, clock, codetools, colorspace, cpp11, crayon, data.table, delayed, diagram, digest, dplyr, e1071, evaluate, fansi, farver, fastmap, fontawesome, foreach, fs, future, future.apply, generics, ggplot2, globals, glue, gower, gplots, gtable, gtools, hardhat, highr, hms, htmltools, htmlwidgets, igraph, ipred, isoband, iterators, jquerylib, jsonlite, KernSmooth, knitr, labeling, lattice, lava, lifecycle, listenv, lubridate, magrittr, MASS, Matrix, memoise, mgcv, mime, ModelMetrics, munsell, nlme, nnet, numDeriv, origami, parallelly, pillar, pkgconfig, plyr, prettyunits, pROC, prodlim, progress, progressr, proxy, purrr, R.methodsS3, R.oo, R.utils, R6, rappdirs, rbibutils, RColorBrewer, Rcpp, Rdpack, recipes, reshape2, rlang, rmarkdown, ROCR, rpart, rstackdeque, sass, scales, shape, SQUAREM, stringi, stringr, survival, tibble, tidyr, tidyselect, timechange, timeDate, tinytex, tzdb, utf8, uuid, vctrs, viridisLite, visNetwork, withr, xfun, yaml

Defining New sl3 Learners

Rendered from custom_lrnrs.Rmd using knitr::rmarkdown on Oct 27 2024.

Last update: 2024-04-29
Started: 2017-08-13

Modern Machine Learning in R

Rendered from intro_sl3.Rmd using knitr::rmarkdown on Oct 27 2024.

Last update: 2019-10-08
Started: 2017-08-13

Readme and manuals

Help Manual

Help page | Topics
Get all arguments of parent call (both specified and defaults) as list | args_to_list
Bicycle sharing time series dataset | bsds
Subset of growth data from the collaborative perinatal project (CPP) | cpp cpp_imputed
Subset of growth data from the collaborative perinatal project (CPP) | cpp_1yr
Customize chaining for a learner | customize_chain Custom_chain
Cross-validated Risk Estimation | cv_risk
Cross-validated Super Learner | cv_sl
Helper functions to debug sl3 Learners | debugonce_predict debugonce_train debug_predict debug_train sl3_debug_mode undebug_learner
Automatically Defined Metalearner | default_metalearner
h2o Model Definition | define_h2o_X Lrnr_h2o_glm
Learner helpers | delayed_learner_fit_chain delayed_learner_fit_predict delayed_learner_process_formula delayed_learner_subset_covariates delayed_learner_train delayed_make_learner learner_fit_chain learner_fit_predict learner_process_formula learner_subset_covariates learner_train
Simulated data with continuous exposure | density_dat
Convert Factors to indicators | dt_expand_factors factor_to_indicators
Importance: Extract variable importance measures produced by 'randomForest' and sort them in decreasing order of importance. | importance
Variable Importance Plot | importance_plot
Inverse CDF Sampling | inverse_sample
Loss Function Definitions | loss_functions loss_loglik_binomial loss_loglik_multinomial loss_loglik_true_cat loss_squared_error loss_squared_error_multivariate
Univariate ARIMA Models | Lrnr_arima
bartMachine: Bayesian Additive Regression Trees (BART) | Lrnr_bartMachine
Base Class for all sl3 Learners | Lrnr_base make_learner
Bayesian Generalized Linear Models | Lrnr_bayesglm
Bound Predictions | Lrnr_bound
Caret (Classification and Regression) Training | Lrnr_caret
Fit/Predict a learner with Cross Validation | Lrnr_cv
Cross-Validated Selector | Lrnr_cv_selector
Discrete Bayesian Additive Regression Tree sampler | Lrnr_dbarts
Define interaction terms | Lrnr_define_interactions
Density from Classification | Lrnr_density_discretize
Density Estimation With Mean Model and Homoscedastic Errors | Lrnr_density_hse
Density Estimation With Mean Model and Homoscedastic Errors | Lrnr_density_semiparametric
Earth: Multivariate Adaptive Regression Splines | Lrnr_earth
Exponential Smoothing state space model | Lrnr_expSmooth
Nonlinear Optimization via Genetic Algorithm (GA) | Lrnr_ga
GAM: Generalized Additive Models | Lrnr_gam
GBM: Generalized Boosted Regression Models | Lrnr_gbm
Generalized Linear Models | Lrnr_glm
Computationally Efficient Generalized Linear Model (GLM) Fitting | Lrnr_glm_fast
Semiparametric Generalized Linear Models | Lrnr_glm_semiparametric
GLMs with Elastic Net Regularization | Lrnr_glmnet
Generalized Linear Model Trees | Lrnr_glmtree
Generalized Random Forests Learner | Lrnr_grf
Generalized Random Forests for Conditional Average Treatment Effects | Lrnr_grfcate
Recurrent Neural Network with Gated Recurrent Unit (GRU) with Keras | Lrnr_gru_keras
Grouped Time-Series Forecasting | Lrnr_gts
Grid Search Models with h2o | Lrnr_h2o_classifier Lrnr_h2o_grid Lrnr_h2o_mutator
Scalable Highly Adaptive Lasso (HAL) | Lrnr_hal9001
Conditional Density Estimation with the Highly Adaptive LASSO | Lrnr_haldensify
Harmonic Regression | Lrnr_HarmonicReg
Hierarchical Time-Series Forecasting | Lrnr_hts
Classification from Binomial Regression | Lrnr_independent_binomial
LightGBM: Light Gradient Boosting Machine | Lrnr_lightgbm
Long short-term memory Recurrent Neural Network (LSTM) with Keras | Lrnr_lstm_keras
Fitting Intercept Models | Lrnr_mean
Stratify univariable time-series learners by time-series | Lrnr_multiple_ts
Multivariate Learner | Lrnr_multivariate
Feed-Forward Neural Networks and Multinomial Log-Linear Models | Lrnr_nnet
Non-negative Linear Least Squares | Lrnr_nnls
Optimize Metalearner according to Loss Function using optim | Lrnr_optim
Principal Component Analysis and Regression | Lrnr_pca
Use SuperLearner Wrappers, Screeners, and Methods, in sl3 | Lrnr_pkg_SuperLearner Lrnr_pkg_SuperLearner_method Lrnr_pkg_SuperLearner_screener
Polyspline - multivariate adaptive polynomial spline regression (polymars) and polychotomous regression and multiple classification (polyclass) | Lrnr_polspline
Classification from Pooled Hazards | Lrnr_pooled_hazards
Random Forests | Lrnr_randomForest
Ranger: Fast(er) Random Forests | Lrnr_ranger
Learner that chains into a revere task | Lrnr_revere_task
Learner for Recursive Partitioning and Regression Trees | Lrnr_rpart
Univariate GARCH Models | Lrnr_rugarch
Augmented Covariate Screener | Lrnr_screener_augment
Coefficient Magnitude Screener | Lrnr_screener_coefs
Correlation Screening Procedures | Lrnr_screener_correlation
Variable Importance Screener | Lrnr_screener_importance
The Super Learner Algorithm | Lrnr_sl
Nonlinear Optimization via Augmented Lagrange | Lrnr_solnp
Nonlinear Optimization via Augmented Lagrange | Lrnr_solnp_density
Stratify learner fits by a single variable | Lrnr_stratified
Learner with Covariate Subsetting | Lrnr_subset_covariates
Support Vector Machines | Lrnr_svm
Time-specific weighting of prediction losses | Lrnr_ts_weights
Nonlinear Time Series Analysis | Lrnr_tsDyn
xgboost: eXtreme Gradient Boosting | Lrnr_xgboost
Make a stack of sl3 learners | make_learner_stack
Combine predictions from multiple learners | metalearners metalearner_linear metalearner_linear_multinomial metalearner_linear_multivariate metalearner_logistic_binomial
Pack multidimensional predictions into a vector (and unpack again) | pack_predictions unpack_predictions
Pipeline (chain) of learners | Pipeline
Generate a Pooled Hazards Task from a Failure Time (or Categorical) Task | pooled_hazard_task
Predict Class from Predicted Probabilities | predict_classes
Plot predicted and true values for diagnostic purposes | prediction_plot
Process Data | process_data
Risk Estimation | risk
Factory Risk Function for ROCR Performance Measures with Binary Outcomes | custom_ROCR_risk risk_functions
dim that works for vectors too | safe_dim
Container Class for data.table Shared Between Tasks | Shared_Data
List sl3 Learners | sl3_list_learners sl3_list_properties
Revere (SplitSpecific) Task | sl3_revere_Task
Define a Machine Learning Task | make_sl3_Task sl3_Task
Querying/setting a single 'sl3' option | sl3Options
Learner Stacking | Stack
Make folds work on subset of data | subset_folds
Subset Tasks for CV: The functions use origami folds to subset tasks. These functions are used by Lrnr_cv (and therefore other learners that use Lrnr_cv). So that nested cv works properly, currently the subsetted task objects do not have fold structures of their own, and so generate them from defaults if nested cv is requested. | train_task validation_task
Undocumented Learner | undocumented_learner
Specify Variable Type | Variable_Type variable_type
Generate a file containing a template 'sl3' Learner | write_learner_template