I was trying to install and configure Apache Airflow on a three-node dev Hadoop cluster with the following configuration and versions:

Operating System: Red Hat Enterprise Linux Server 7.7
python 3.7.3
anaconda 2
spark 2.45

a) sudo yum install gcc gcc-c++ -y
b) sudo yum install libffi-devel mariadb-devel cyrus-sasl-devel -y
c) pip install 'apache-airflow[all]'
d) airflow initdb  -- the airflow.cfg file was created with SQLite

Then I ran the following commands to configure it with MySQL:

a) rpm -Uvh https://repo.mysql.com/mysql80-community-release-el7-3.noarch.rpm 
b) sed -i 's/enabled=1/enabled=0/' /etc/yum.repos.d/mysql-community.repo 
c) yum --enablerepo=mysql80-community install mysql-community-server 
d) systemctl start mysqld.service

Then I ran the following in MySQL:

a) CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci; 
b) create user 'airflow'@'localhost' identified by 'Airflow123'; 
c) grant all privileges on * . * to 'airflow'@'localhost'; 

Here are some details from my airflow.cfg file:

broker_url = sqla+mysql://airflow:airflow@localhost:3306/airflow
result_backend = db+mysql://airflow:airflow@localhost:3306/airflow
sql_alchemy_conn = mysql://airflow:Airflow123@localhost:3306/airflow
executor = CeleryExecutor
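
One thing that may be worth checking in the excerpt above: broker_url and result_backend carry the password airflow, while sql_alchemy_conn and the CREATE USER statement use Airflow123. This is separate from the import error, but it could surface later when Celery connects. A minimal sketch, using only Python's standard library, that makes the difference visible; the helper names credentials and mismatched_passwords are made up for illustration:

```python
from urllib.parse import urlsplit

# Connection strings copied from the airflow.cfg excerpt above.
URLS = {
    "broker_url": "sqla+mysql://airflow:airflow@localhost:3306/airflow",
    "result_backend": "db+mysql://airflow:airflow@localhost:3306/airflow",
    "sql_alchemy_conn": "mysql://airflow:Airflow123@localhost:3306/airflow",
}


def credentials(url):
    """Return the (username, password) pair embedded in a connection URL."""
    parts = urlsplit(url)
    return parts.username, parts.password


def mismatched_passwords(urls):
    """Group setting names by password; return {} when all passwords agree."""
    groups = {}
    for name, url in urls.items():
        _, password = credentials(url)
        groups.setdefault(password, []).append(name)
    return groups if len(groups) > 1 else {}


# Print one line per distinct password so differences stand out.
for password, names in mismatched_passwords(URLS).items():
    print(f"{password!r} is used by: {', '.join(names)}")
```

If the three settings are meant to share one MySQL account, the passwords should presumably match the one given in CREATE USER.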

I am getting the error below when running the airflow initdb command:

ImportError: /home/xyz/anaconda2/envs/python3.7.2/lib/python3.7/site-packages/_mysql.cpython-37m-x86_64-linux-gnu.so: symbol mysql_real_escape_string_quote, version libmysqlclient_18 not defined in file libmysqlclient.so.18 with link time reference

I have set up my .bashrc file as:

export AIRFLOW_HOME=~/airflow

Here is the directory that was created:

[xyz@innolx5984 airflow]$ pwd
/home/xyz/airflow

When I search for "libmysqlclient", I find these instances:

[xyz@innolx5984 airflow]$ find /home/xyz/ -name "*libmysqlclient*"
/home/xyz/anaconda2/pkgs/mysql-connector-c-6.1.11-h597af5e_1/lib/libmysqlclient.so
/home/xyz/anaconda2/pkgs/mysql-connector-c-6.1.11-h597af5e_1/lib/libmysqlclient.a
/home/xyz/anaconda2/pkgs/mysql-connector-c-6.1.11-h597af5e_1/lib/libmysqlclient.so.18
/home/xyz/anaconda2/pkgs/mysql-connector-c-6.1.11-h597af5e_1/lib/libmysqlclient.so.18.4.0
/home/xyz/anaconda2/lib/libmysqlclient.a
/home/xyz/anaconda2/lib/libmysqlclient.so
/home/xyz/anaconda2/lib/libmysqlclient.so.18
/home/xyz/anaconda2/lib/libmysqlclient.so.18.4.0
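
The ImportError above says that mysql_real_escape_string_quote is not defined in libmysqlclient.so.18. A minimal sketch, assuming Python's standard ctypes module, that asks each library found above whether that symbol resolves through it. The names has_symbol and CANDIDATES are illustrative. The lookup ignores symbol versions such as libmysqlclient_18, so treat the result as a hint only.

```python
import ctypes

# Paths taken from the find output above; adjust them for your own machine.
CANDIDATES = [
    "/home/xyz/anaconda2/lib/libmysqlclient.so.18",
    "/home/xyz/anaconda2/pkgs/mysql-connector-c-6.1.11-h597af5e_1/lib/libmysqlclient.so.18",
]
SYMBOL = "mysql_real_escape_string_quote"  # symbol named in the ImportError


def has_symbol(library, symbol):
    """Return True/False for whether the symbol resolves via the library.

    Returns None when the library itself cannot be loaded.
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        return None
    try:
        # ctypes raises AttributeError when the dynamic lookup fails.
        getattr(handle, symbol)
    except AttributeError:
        return False
    return True


for path in CANDIDATES:
    print(path, "->", has_symbol(path, SYMBOL))
```

A False for every candidate would suggest that the compiled _mysql extension expects a newer client library than the 6.1.11 connector shipped with Anaconda2.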

Adding a few more details in case they help:

[xyz@innolx5984 airflow]$ mysql_config
Usage: /home/xyz/anaconda2/bin/mysql_config [OPTIONS]
Options:
        --cflags         [-I/home/xyz/anaconda2/include ]
        --cxxflags       [-I/home/xyz/anaconda2/include ]
        --include        [-I/home/xyz/anaconda2/include]
        --libs           [-L/home/xyz/anaconda2/lib -lmysqlclient ]
        --libs_r         [-L/home/xyz/anaconda2/lib -lmysqlclient ]
        --plugindir      [/home/xyz/anaconda2/lib/plugin]
        --socket         [/tmp/mysql.sock]
        --port           [0]
        --version        [6.1.11]
        --variable=VAR   VAR is one of:
                pkgincludedir [/home/xyz/anaconda2/include]
                pkglibdir     [/home/xyz/anaconda2/lib]
                plugindir     [/home/xyz/anaconda2/lib/plugin]
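
The output above shows that mysql_config resolves to the Anaconda2 copy (version 6.1.11), not to one installed by the MySQL 8.0 packages. A minimal sketch, using only Python's standard library, that lists every mysql_config on PATH in lookup order so that shadowing is easy to spot; find_all_on_path is a hypothetical helper name:

```python
import os


def find_all_on_path(command, path=None):
    """Return every executable named `command` on PATH, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


# The first entry printed is the one a plain shell lookup would pick up.
for location in find_all_on_path("mysql_config"):
    print(location)
```

If more than one location is printed, the order of the PATH entries decides which client library settings get reported.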

I am looking for help and suggestions to resolve this issue. I am not sure whether I am heading in the right direction.

  • I suspect mariadb is causing this issue. Try yum install python3-devel mysql-devel; pip install mysqlclient Commented Apr 2, 2020 at 9:53
  • We ran this and then airflow initdb. It is not working and throws the same error. Commented Apr 2, 2020 at 10:07
  • yum remove mariadb-devel Commented Apr 2, 2020 at 10:18
  • It now throws: No Match for argument: mariadb-devel Commented Apr 2, 2020 at 10:28
  • Okay, let me try to reproduce the issue! Commented Apr 6, 2020 at 5:30

1 Answer

Follow these steps to install Apache Airflow with MySQL using Anaconda3

1) Install Pre-requisites

yum install gcc gcc-c++ -y
yum install libffi-devel mariadb-devel cyrus-sasl-devel -y
yum install redhat-rpm-config -y    # or: dnf install redhat-rpm-config, where dnf is available

2) Install Anaconda3 (comes with Python 3.7.6)

yum install libXcomposite libXcursor libXi libXtst libXrandr alsa-lib mesa-libEGL libXdamage mesa-libGL libXScrnSaver
wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh
chmod +x Anaconda3-2020.02-Linux-x86_64.sh 
./Anaconda3-2020.02-Linux-x86_64.sh

Make sure you answer yes when the installer offers to initialize Anaconda3 by running conda init. This ensures that the correct versions of python and pip are used in the subsequent steps.

3) Install Apache Airflow

pip install 'apache-airflow[mysql,celery]'

You can add other subpackages as required. I have included only the ones Airflow needs to use a MySQL database as its backend.

4) Initialize Airflow

export AIRFLOW_HOME=~/airflow
airflow initdb

From here, I have mimicked the steps you followed to configure MySQL Server.

5) Install MySQL Server

rpm -Uvh https://repo.mysql.com/mysql80-community-release-el7-3.noarch.rpm 
sed -i 's/enabled=1/enabled=0/' /etc/yum.repos.d/mysql-community.repo 
yum --enablerepo=mysql80-community install mysql-server 
systemctl start mysqld.service

6) Log in to MySQL and configure the database for Airflow

mysql> CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci; 
mysql> CREATE user 'airflow'@'localhost' identified by 'Airflow123'; 
mysql> GRANT ALL privileges on *.* to 'airflow'@'localhost'; 

7) Update Airflow configuration file (~/airflow/airflow.cfg)

sql_alchemy_conn = mysql://airflow:Airflow123@localhost:3306/airflow
executor = CeleryExecutor

8) Initialize Airflow again, this time against MySQL

airflow initdb
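
Before re-running airflow initdb, it can help to confirm that the edited file really points at MySQL and not at the SQLite default. A minimal sketch, using Python's standard configparser; backend_settings and uses_mysql are hypothetical helper names, and the [core] section name is an assumption based on Airflow 1.10-style configuration files:

```python
import configparser


def backend_settings(cfg_text):
    """Extract executor and sql_alchemy_conn from airflow.cfg contents."""
    # interpolation=None so '%' characters in the file are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return {
        "executor": parser.get("core", "executor", fallback=None),
        "sql_alchemy_conn": parser.get("core", "sql_alchemy_conn", fallback=None),
    }


def uses_mysql(settings):
    """True when the metadata DB URL starts with a mysql scheme."""
    conn = settings["sql_alchemy_conn"] or ""
    return conn.startswith("mysql")


# Assumed excerpt of ~/airflow/airflow.cfg after step 7 (section name assumed).
EXAMPLE_CFG = """
[core]
sql_alchemy_conn = mysql://airflow:Airflow123@localhost:3306/airflow
executor = CeleryExecutor
"""

settings = backend_settings(EXAMPLE_CFG)
print(settings["executor"], uses_mysql(settings))
```

To check a real installation, the contents of ~/airflow/airflow.cfg could be read from disk and passed to backend_settings in place of EXAMPLE_CFG.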

Comments

Many thanks for your kind support. I am accepting this as the answer. It looks like I have to upgrade to Anaconda3. The reason I switched to Anaconda2 was that I could not find the hdfs3 module for Python 3.7.6. I will look into that later.
Sure, let me know if you run into any issues going forward.
I followed the same process but got the same ImportError, and I am not able to run dnf install redhat-rpm-config or mysql_config. Both say command not found.
Maybe some of the old libraries were not removed properly. Comment if any issue pops up.
Great! Finally solved. When I tried to reproduce it, it was a fresh installation, so it went smoothly without any hiccups.