DolphinScheduler: PYTHON_HOME Modification

DolphinScheduler is an open-source distributed workflow scheduling platform. It supports various task types, such as shell, SQL, and Python. This article focuses on configuring the PYTHON_HOME environment variable in DolphinScheduler.

Python is widely used for data processing, machine learning, and other tasks in the field of data engineering. DolphinScheduler provides a convenient way to run Python scripts as part of a workflow. However, to ensure that the Python scripts run correctly, it is essential to configure the PYTHON_HOME environment variable.

The PYTHON_HOME environment variable specifies the directory where Python is installed on the DolphinScheduler server. This allows DolphinScheduler to locate the Python interpreter and execute Python scripts.
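Before editing anything, it helps to know where the interpreter actually lives on the server. The following is a minimal sketch, not part of DolphinScheduler: the function name locate_python is hypothetical, and it assumes the interpreter you want is reachable through PATH for the user running the command.

```shell
# Print the absolute path of a Python interpreter found on PATH.
# The command name defaults to python3; pass another name
# (for example python3.11) to look that one up instead.
locate_python() {
    local name="${1:-python3}"
    local path
    # command -v prints the path of the executable the shell would run
    path="$(command -v "$name")" || {
        echo "no '$name' found on PATH" >&2
        return 1
    }
    printf '%s\n' "$path"
}

# Example usage:
#   locate_python            # looks up python3
#   locate_python python3.11 # looks up a specific version
```

The printed path is a candidate value for PYTHON_HOME; if nothing is found, Python may be installed outside PATH and has to be located manually.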

To change PYTHON_HOME in DolphinScheduler, follow these steps:

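Step 1: Locate the configuration file

The configuration file for DolphinScheduler is located in the conf directory under the DolphinScheduler installation directory. The file name is dolphinscheduler-env.sh. Some releases name the file dolphinscheduler_env.sh (with an underscore) and keep it in an env subdirectory, so check your installation if the path above does not exist.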
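If you are unsure where the file sits in your installation, a search can find it. This is a minimal sketch: the function name find_env_file is hypothetical, and the pattern deliberately matches both a hyphen and an underscore in the file name, on the assumption that the spelling may differ between releases.

```shell
# List candidate environment files under a DolphinScheduler installation directory.
# The bracket pattern [-_] accepts either spelling of the file name.
find_env_file() {
    local install_dir="$1"
    find "$install_dir" -type f -name 'dolphinscheduler[-_]env.sh' 2>/dev/null
}

# Example usage (adjust the path to your own installation directory):
#   find_env_file /opt/dolphinscheduler
```

If the search prints more than one path, check which one your release actually reads before editing.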

Step 2: Open the configuration file

You can use any text editor to open the dolphinscheduler-env.sh file. For example, with vi:

vi dolphinscheduler-env.sh

Step 3: Locate the PYTHON_HOME configuration

In the dolphinscheduler-env.sh file, you will find a line that defines the PYTHON_HOME variable. By default, it is set to /usr/local/python. Change it to match where Python actually lives on your server. For example, if the interpreter you want to use is /usr/bin/python3 (note that this is the path of the executable itself, not of a directory), modify the line to:

PYTHON_HOME=/usr/bin/python3
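The same edit can be scripted instead of made by hand. The sketch below is one possible approach, not a DolphinScheduler tool: set_python_home is a hypothetical helper that assumes the file contains a line of the form PYTHON_HOME=... (optionally prefixed with export) and that the new value contains no | or & characters. It keeps a .bak copy of the original file.

```shell
# Replace the value on the PYTHON_HOME line of an environment file.
# Usage: set_python_home <env-file> <new-value>
set_python_home() {
    local env_file="$1"
    local new_value="$2"
    # Refuse to continue when there is no PYTHON_HOME line to rewrite
    grep -Eq '^[[:space:]]*(export[[:space:]]+)?PYTHON_HOME=' "$env_file" || {
        echo "no PYTHON_HOME line in $env_file" >&2
        return 1
    }
    # Keep a backup so the change can be reverted
    cp "$env_file" "$env_file.bak"
    # Rewrite only the value; the text before it (including any "export") is preserved
    sed -E "s|^([[:space:]]*(export[[:space:]]+)?PYTHON_HOME=).*|\1${new_value}|" \
        "$env_file.bak" > "$env_file"
}

# Example usage:
#   set_python_home dolphinscheduler-env.sh /usr/bin/python3
```

Reading from the backup and writing to the original avoids relying on the in-place option of sed, whose syntax differs between GNU and BSD versions.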

Step 4: Save and exit

After modifying the dolphinscheduler-env.sh file, save it and exit the text editor.

Step 5: Restart DolphinScheduler

To apply the changes, you need to restart the DolphinScheduler services. You can use the following commands:

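cd /opt/dolphinscheduler/bin
# dolphinscheduler-daemon.sh expects a single service name (for example worker-server);
# the bundled stop-all.sh and start-all.sh scripts handle every service at once
./stop-all.sh
./start-all.sh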

After following these steps, DolphinScheduler will use the interpreter identified by PYTHON_HOME to execute Python scripts in workflows.

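Note that the example above sets PYTHON_HOME to the full path of the interpreter (/usr/bin/python3) rather than to an installation directory. Whether the value is treated as a directory or as an executable path can depend on the DolphinScheduler version. If the directory contains multiple Python versions, specifying the complete path to the desired interpreter removes the ambiguity.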
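A quick pre-flight check can confirm that the chosen value leads to a usable interpreter before any service is restarted. The sketch below is only an illustration and does not reproduce DolphinScheduler's own lookup logic: resolve_python is a hypothetical helper, and the list of locations it tries inside a directory is an assumption.

```shell
# Print the interpreter that a PYTHON_HOME-style value leads to, or fail.
# Accepts either the path of an executable or a directory containing one.
resolve_python() {
    local python_home="$1"
    local candidate
    if [ -f "$python_home" ] && [ -x "$python_home" ]; then
        # The value already names an executable file
        printf '%s\n' "$python_home"
        return 0
    fi
    if [ -d "$python_home" ]; then
        # Try a few common locations inside the directory (assumed layout)
        for candidate in \
            "$python_home/bin/python3" "$python_home/bin/python" \
            "$python_home/python3" "$python_home/python"; do
            if [ -f "$candidate" ] && [ -x "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    fi
    echo "no Python interpreter found for '$python_home'" >&2
    return 1
}

# Example usage:
#   py="$(resolve_python /usr/bin/python3)" && "$py" --version
```

Running the resolved interpreter with --version, as in the usage comment, also shows which Python version workflows would get.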

Here is a flowchart illustrating the process of modifying the PYTHON_HOME in DolphinScheduler:

flowchart TD
A[Locate the configuration file] --> B[Open the configuration file]
B --> C[Locate the PYTHON_HOME configuration]
C --> D[Modify the PYTHON_HOME]
D --> E[Save and exit]
E --> F[Restart DolphinScheduler]

In conclusion, configuring the PYTHON_HOME environment variable in DolphinScheduler is essential for running Python scripts as part of workflows. By following the steps outlined in this article, you can easily modify the PYTHON_HOME and ensure that DolphinScheduler can locate and execute Python scripts correctly.

Note: It is recommended to consult the DolphinScheduler documentation for the specific version you are using, as the steps may vary slightly.