
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto/case_studies/feature_engineering/eda/notebook.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_case_studies_feature_engineering_eda_notebook.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_case_studies_feature_engineering_eda_notebook.py:


Flyte Pipeline in One Jupyter Notebook
=======================================

In this example, we implement a simple pipeline that takes hyperparameters, performs exploratory data analysis (EDA) and feature engineering, and measures a Gradient Boosting model's performance using mean absolute error (MAE), all in one notebook.

.. GENERATED FROM PYTHON SOURCE LINES 10-11

First, let's import the libraries we will use in this example.

.. GENERATED FROM PYTHON SOURCE LINES 11-17

.. code-block:: default

    import os
    import pathlib

    from flytekit import Resources, kwtypes, workflow
    from flytekitplugins.papermill import NotebookTask


.. GENERATED FROM PYTHON SOURCE LINES 18-34

We define a ``NotebookTask`` to run the Jupyter notebook ``supermarket_regression.ipynb``.

.. list-table:: ``NotebookTask`` Parameters
   :widths: 25 25

   * - ``notebook_path``
     - Path to the Jupyter notebook file
   * - ``inputs``
     - Inputs to be sent to the notebook
   * - ``outputs``
     - Outputs to be returned from the notebook
   * - ``requests``
     - Compute resource requests for the task

This notebook returns ``mae_score`` as the output.

.. GENERATED FROM PYTHON SOURCE LINES 34-50

.. code-block:: default

    nb = NotebookTask(
        name="pipeline-nb",
        # Resolve the notebook relative to this file
        notebook_path=os.path.join(
            pathlib.Path(__file__).parent.absolute(), "supermarket_regression.ipynb"
        ),
        # Hyperparameters passed into the notebook
        inputs=kwtypes(
            n_estimators=int,
            max_depth=int,
            max_features=str,
            min_samples_split=int,
            random_state=int,
        ),
        # Value the notebook hands back to the workflow
        outputs=kwtypes(mae_score=float),
        requests=Resources(mem="500Mi"),
    )


.. GENERATED FROM PYTHON SOURCE LINES 51-52

Because the ``NotebookTask`` already serves as the task, we do not need to define a separate one. We create a ``workflow`` that calls it and returns the MAE score.

.. GENERATED FROM PYTHON SOURCE LINES 52-72

.. code-block:: default

    @workflow
    def notebook_wf(
        n_estimators: int = 150,
        max_depth: int = 3,
        max_features: str = "sqrt",
        min_samples_split: int = 4,
        random_state: int = 2,
    ) -> float:
        output = nb(
            n_estimators=n_estimators,
            max_depth=max_depth,
            max_features=max_features,
            min_samples_split=min_samples_split,
            random_state=random_state,
        )
        return output.mae_score


.. GENERATED FROM PYTHON SOURCE LINES 73-74

We can now run the workflow, and with it the notebook, locally.

.. GENERATED FROM PYTHON SOURCE LINES 74-76

.. code-block:: default

    if __name__ == "__main__":
        print(notebook_wf())


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.000 seconds)


.. _sphx_glr_download_auto_case_studies_feature_engineering_eda_notebook.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: notebook.py <notebook.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: notebook.ipynb <notebook.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
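
The workflow above returns the notebook's ``mae_score``. To show what that number measures, here is a minimal, dependency-free sketch of mean absolute error. The helper name ``mean_absolute_error`` and the sample values are illustrative assumptions, not taken from the notebook, which may well compute the score differently (for example, with a library routine).

.. code-block:: python

    from typing import Sequence


    def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
        """Return the average absolute difference between targets and predictions."""
        if len(y_true) != len(y_pred):
            raise ValueError("y_true and y_pred must have the same length")
        if len(y_true) == 0:
            raise ValueError("at least one sample is required")
        return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


    # Hypothetical targets and predictions, for illustration only.
    actual = [10.0, 12.0, 8.0, 15.0]
    predicted = [11.0, 12.0, 6.0, 14.0]

    # Absolute errors are 1, 0, 2 and 1, so the mean is 4 / 4.
    score = mean_absolute_error(actual, predicted)

A lower score means the predictions sit closer to the targets on average, so this is the value one would try to reduce when changing hyperparameters such as ``n_estimators`` or ``max_depth``.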