Setup MLflow in Production

Published: 2023/12/15 47 豆豆


This is the first article in my MLflow tutorial series:

Setup MLflow in Production (you are here!)

MLflow: Basic logging functions

MLflow logging for TensorFlow

MLflow Projects

Retrieving the best model using Python API for MLflow

Serving a model using MLflow

MLflow is an open-source platform for managing the machine learning lifecycle. Recently, I set up MLflow in production with a Postgres database as the backend store of the Tracking Server and SFTP for transferring artifacts over the network. It took me about two weeks to get all the components right, but this post should help you set up MLflow in a production environment in about 10 minutes.

Requirements

Anaconda

Tracking Server Setup

The Tracking Server stores the metadata that you see in the MLflow UI. First, let's create a new Conda environment:

conda create -n mlflow_env
conda activate mlflow_env

Install the MLflow and PySFTP libraries:

conda install python
pip install mlflow
pip install pysftp

Our Tracking Server uses a Postgres database as a backend for storing the metadata. So let’s install PostgreSQL:

apt-get install postgresql postgresql-contrib postgresql-server-dev-all

Next, we will create the admin user and a database for the Tracking Server:

sudo -u postgres psql

In the psql console:

CREATE DATABASE mlflow_db;
CREATE USER mlflow_user WITH ENCRYPTED PASSWORD 'mlflow';
GRANT ALL PRIVILEGES ON DATABASE mlflow_db TO mlflow_user;

Since the Tracking Server talks to Postgres from Python, we need to install the psycopg2 library. To make sure the installation succeeds, install the GCC package first:

sudo apt install gcc
pip install psycopg2-binary

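If you would like to connect to the PostgreSQL server remotely, or give other users access to it, go to the PostgreSQL data directory. The path below is the one used in this setup; on Debian and Ubuntu the configuration files usually live under /etc/postgresql/<version>/main instead.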

cd /var/lib/pgsql/data

Then add the following line at the end of the postgresql.conf file.

listen_addresses = '*'

You can then specify a remote IP address that is allowed to connect to the PostgreSQL server by adding the following line at the end of the pg_hba.conf file:

host all all 10.10.10.187/32 trust

Here 10.10.10.187/32 is the remote IP address. To allow connections from any IP address, use 0.0.0.0/0 instead. Keep in mind that the trust method lets matching clients connect without a password; use a method such as md5 or scram-sha-256 if you want password authentication. Then restart the PostgreSQL server to apply the changes.

service postgresql restart
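
The address column in pg_hba.conf is written in CIDR notation: a /32 suffix matches exactly one IPv4 address, while 0.0.0.0/0 matches every IPv4 address. The following minimal sketch uses Python's standard ipaddress module to illustrate that matching rule. The helper is_allowed is a hypothetical name used only for this illustration and is not part of PostgreSQL.

```python
import ipaddress


def is_allowed(client_ip, cidr):
    """Return True if client_ip falls inside the network written as cidr."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    # The single host listed in the pg_hba.conf line above.
    print(is_allowed("10.10.10.187", "10.10.10.187/32"))
    # A neighbouring address is outside a /32 range.
    print(is_allowed("10.10.10.188", "10.10.10.187/32"))
    # 0.0.0.0/0 covers every IPv4 address.
    print(is_allowed("192.168.1.5", "0.0.0.0/0"))
```

If you only need a handful of client machines to reach the database, listing each of them with /32 is the narrower choice.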

The next step is to create a directory where the Tracking Server stores the machine learning models and other artifacts. Remember that the Postgres database only stores metadata about those models. The location of this directory is called the artifact URI.

mkdir -p ~/mlflow/mlruns

Create a logging directory.

mkdir -p ~/mlflow/mllogs

You can run the Tracking Server with the following command, but the server stops as soon as you press Ctrl-C or close the terminal.

mlflow server --backend-store-uri postgresql://mlflow_user:mlflow@localhost/mlflow_db --default-artifact-root sftp://mlflow_user@<hostname_of_server>:~/mlflow/mlruns -h 0.0.0.0 -p 8000
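
The backend store URI embeds the database user and password, so a password containing characters such as @ or / has to be percent-encoded before it goes into the URI. The following is a minimal Python sketch, not part of the original setup, that assembles the same command from its parts. The helper names build_backend_uri and build_server_command are hypothetical, and the artifact root is passed through unchanged.

```python
import shlex
from urllib.parse import quote


def build_backend_uri(user, password, host, database):
    """Return a PostgreSQL URI with the user and password percent-encoded."""
    return "postgresql://{}:{}@{}/{}".format(
        quote(user, safe=""), quote(password, safe=""), host, database
    )


def build_server_command(backend_uri, artifact_root, host="0.0.0.0", port=8000):
    """Return the `mlflow server` invocation from this section as an argument list."""
    return [
        "mlflow", "server",
        "--backend-store-uri", backend_uri,
        "--default-artifact-root", artifact_root,
        "-h", host,
        "-p", str(port),
    ]


if __name__ == "__main__":
    uri = build_backend_uri("mlflow_user", "mlflow", "localhost", "mlflow_db")
    cmd = build_server_command(
        uri, "sftp://mlflow_user@<hostname_of_server>:~/mlflow/mlruns"
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

With the simple password used in this post the encoded URI is identical to the one typed by hand; the encoding only matters once the password contains reserved characters.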

If you want the Tracking Server to come back up after reboots and to recover from failures, it is very useful to run it as a systemd service.

You need to go into the /etc/systemd/system directory and create a new file called mlflow-tracking.service with the following content:

[Unit]
Description=MLflow Tracking Server
After=network.target

[Service]
Restart=on-failure
RestartSec=30
StandardOutput=file:/path_to_your_logging_folder/stdout.log
StandardError=file:/path_to_your_logging_folder/stderr.log
User=root
ExecStart=/bin/bash -c 'PATH=/path_to_your_conda_installation/envs/mlflow_env/bin/:$PATH exec mlflow server --backend-store-uri postgresql://mlflow_user:mlflow@localhost/mlflow_db --default-artifact-root sftp://mlflow_user@<hostname_of_server>:~/mlflow/mlruns -h 0.0.0.0 -p 8000'

[Install]
WantedBy=multi-user.target
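
A unit file needs each section header ([Unit], [Service], [Install]) on its own line; a header that ends up glued to the end of the previous value is an easy copy-and-paste mistake. The sketch below is one possible sanity check, using Python's standard configparser module to report expected keys that are missing from their section. The REQUIRED table and the helper missing_keys are assumptions made for this illustration, not a systemd tool.

```python
import configparser

# Keys this sketch expects in each section (an assumption based on the unit above).
REQUIRED = {
    "Unit": ["Description", "After"],
    "Service": ["ExecStart", "Restart", "User"],
    "Install": ["WantedBy"],
}


def missing_keys(unit_text):
    """Return 'Section.Key' names that are absent from the unit file text."""
    parser = configparser.ConfigParser(
        interpolation=None,  # leave $PATH and similar text untouched
        strict=False,        # tolerate repeated keys
        delimiters=("=",),   # values such as file:/path contain colons
    )
    parser.optionxform = str  # keep the key case used by systemd
    parser.read_string(unit_text)
    missing = []
    for section, keys in REQUIRED.items():
        for key in keys:
            if not parser.has_option(section, key):
                missing.append(f"{section}.{key}")
    return missing
```

Passing the contents of mlflow-tracking.service to missing_keys should return an empty list when the three sections are laid out correctly.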

Activate and enable the above service with the following commands:

sudo systemctl daemon-reload
sudo systemctl enable mlflow-tracking
sudo systemctl start mlflow-tracking

Check that everything worked as expected with the following command:

sudo systemctl status mlflow-tracking

You should see an output similar to this:

[Screenshot: systemd unit running]

Create a user on the server named mlflow_user and make the mlflow directory the working directory for this user. Then create an SSH key pair in the .ssh directory of mlflow_user (/mlflow/.ssh in our case). Put the public key in the authorized_keys file and share the private key with the users.

Additionally, for the MLflow UI to be able to read the artifacts, copy the private key to /root/.ssh/ as well.

Next, we need to add the server's host key to the known_hosts file manually using these commands:

cd /root/.ssh
ssh-keyscan -H <hostname_of_server> >> known_hosts

You can now restart the machine and the MLflow Tracking Server will be up and running after this restart.

On the client machines

To track everything on the production Tracking Server, set the following environment variable in your .bashrc:

export MLFLOW_TRACKING_URI='http://<hostname_of_server>:8000'

Do not forget to source your .bashrc file!

. ~/.bashrc
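
A variable exported in .bashrc only reaches processes started from a shell that has sourced the file, so it can be worth confirming that Python actually sees it. The following minimal sketch reads MLFLOW_TRACKING_URI and splits it into scheme, host and port using only the standard library. The helper name tracking_endpoint is hypothetical.

```python
import os
from urllib.parse import urlsplit


def tracking_endpoint(env=None):
    """Return (scheme, host, port) parsed from MLFLOW_TRACKING_URI."""
    env = os.environ if env is None else env
    uri = env.get("MLFLOW_TRACKING_URI")
    if not uri:
        raise RuntimeError("MLFLOW_TRACKING_URI is not set; source your .bashrc first")
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"unexpected tracking URI: {uri!r}")
    return parts.scheme, parts.hostname, parts.port
```

Calling tracking_endpoint() in a fresh Python session on the client should return the host and port 8000 configured above; a RuntimeError means the variable did not reach the process.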

Make sure the mlflow and pysftp pip packages are installed in your environment (pysftp is required to transfer artifacts to the production server).

pip install mlflow
pip install pysftp

To authenticate the pysftp transfers, put the private key generated on the production server in the .ssh directory of your local machine. Then run:

ssh <hostname_of_server>

When prompted to save <hostname_of_server> as a known host, answer yes.
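
The OpenSSH client ignores a private key file that other users on the machine can read, which is a common reason for the ssh step to fall back to a password prompt after the key has been copied over. The sketch below checks and tightens the permissions with the standard library. The helper names is_private and make_private are hypothetical, and the path of the key file is whatever name you saved it under.

```python
import os
import stat


def is_private(path):
    """Return True if only the file owner has any permission on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def make_private(path):
    """Restrict path to owner read/write, the same as `chmod 600`."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Running make_private on the copied key file once is enough; is_private can then be used to confirm the result.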

You can access MLflow UI at http://<hostname_of_server>:8000

[Screenshot: the MLflow UI]
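
Besides opening the UI in a browser, you can check from a script that the server answers. The sketch below assumes the server exposes MLflow's /health endpoint, which recent MLflow releases provide; if your version does not, request the root URL instead. The helper names health_url and server_is_up are hypothetical.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def health_url(tracking_uri):
    """Return the health-check URL for a tracking server base URI."""
    return urljoin(tracking_uri.rstrip("/") + "/", "health")


def server_is_up(tracking_uri, timeout=5.0):
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urlopen(health_url(tracking_uri), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, DNS failures and HTTP errors.
        return False
```

For example, server_is_up("http://<hostname_of_server>:8000") can be called from a client machine before starting a long training job.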

Run a sample machine learning model from the internet to check whether MLflow can track the runs.

mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
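
If you would rather test the setup from Python than from the command line, a very small run is enough to exercise both the Postgres backend and the SFTP artifact store. The sketch below assumes that mlflow and pysftp are installed and that MLFLOW_TRACKING_URI is exported as described above. The experiment name smoke-test, the helper names and the logged values are placeholders chosen for this illustration, not results.

```python
import os
import tempfile


def write_note(directory, text="hello from the client\n"):
    """Create a small text file to upload as an artifact and return its path."""
    path = os.path.join(directory, "note.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


def smoke_test(experiment="smoke-test"):
    """Log one parameter, one metric and one artifact; return the run ID."""
    import mlflow  # imported here so write_note works without mlflow installed

    # The tracking server is taken from MLFLOW_TRACKING_URI in the environment.
    mlflow.set_experiment(experiment)
    with mlflow.start_run() as run, tempfile.TemporaryDirectory() as tmp:
        mlflow.log_param("alpha", 0.5)
        mlflow.log_metric("rmse", 0.0)  # placeholder value, not a real result
        mlflow.log_artifact(write_note(tmp))  # goes through the artifact store
        return run.info.run_id
```

After calling smoke_test() in a Python session, the new run should appear in the MLflow UI with note.txt listed under its artifacts.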

In the next post, I'll talk about the basic MLflow logging functions.

Translated from: https://towardsdatascience.com/setup-mlflow-in-production-d72aecde7fef
