Setting up direct Metastore access¶
Flamy can perform some actions such as metadata retrieval from Hive’s Metastore. It can perform many useful actions such as listing all the schemas, tables, or partitions with useful associated information. For this, Flamy can use either the Thrift client (HiveMetastoreClient) provided by Hive or directly connect to the metastore database via JDBC. While the first method works out of the box, it is often much slower than the second.
In order to grant direct access to the Hive Metastore for Flamy, you need to ask your favorite administrator to create a flamy user and grant it a read-only access to the following tables of the metastore (beware, for names are case-sensitive) :
Currently, flamy is only compatible with Metastore’s backed by PostgreSQL, MySQL or MariaDB. MySQL and MariaDB require the user to manually download the jdbc client jar, for license incompatibility reasons.
If you are not yourself an administrator, and want to perform the changes yourself on a development cluster, you may do so by following theses steps:
1. connect with ssh to the machine hosting the database’s metastore¶
(replace <MY_HOSTS> with the correct name)
2. connect to the database¶
If you have a Cloudera cluster with a postgresql database, you may retrieve the admin password with:
sudo cat /var/lib/cloudera-scm-server-db/data/generated_password.txt
and then connect with:
psql -h localhost -p 7432 hive cloudera-scm
Otherwise please refer to your SQL back end’s documentation.
3. Once connected to the database, create a new user called flamy¶
(replace <FlamyPassword> with the strong password of your choice):
CREATE USER flamy WITH PASSWORD '<FlamyPassword>';
4. And grant read-only access to flamy to the following tables¶
GRANT SELECT ON TABLE "PARTITIONS", "TBLS", "DBS", "SDS", "COLUMNS_V2", "TABLE_PARAMS", "PARTITION_PARAMS" TO flamy ;
5. Change Flamy’s configuration to enable direct access to the Metastore¶
Edit conf/flamy.properties and add or change the line
flamy.env.<ENVIRONMENT>.hive.meta.fetcher.type = direct
for the specific environment you want to connect to (eg: dev, prod, etc.)
You also need to fill the following properties:
flamy.env.<ENVIRONMENT>.hive.metastore.jdbc.uri = ... flamy.env.<ENVIRONMENT>.hive.metastore.jdbc.user = flamy flamy.env.<ENVIRONMENT>.hive.metastore.jdbc.password = ...
For instance, if the metastore is using postgresql, the jdbc uri shall be of this form
If using Cloudera, <PORT> might be 7432, but we recommend checking on your parameters.
6. Try connecting to the database¶
by running the following command:
bin/flamy show schemas --on <ENVIRONMENT>
7. (Optional) If the metastore is using *postgresql*, you might get this kind of error
FATAL: no pg_hba.conf entry for host "XXX.XXX.XXX.XXX", user "flamy", database "hive", SSL off "hive", SSL off
In such case, you will have to:
7.1. connect to the Metastore Host with a sudo user
7.2. edit postgresql’s pg_hba.conf configuration file
If using cloudera and the hive metastore uses the same database as Cloudera Manager:
sudo vi /var/lib/cloudera-scm-server-db/data/pg_hba.conf
sudo vi /etc/postgresql/<PG_VERSION>/main/pg_hba.conf
7.3. Add the following line
# Grant access for user flamy to hive database host hive flamy <IP/MASK> md5
where <IP/MASK> describe the subnetwork that will require to connect to the database with flamy. Check https://www.iplocation.net/subnet-mask for more informations.
7.4. Finally, restart the database (schedule a maintenance for this)
Be careful if the database is using postgresql, and especially if the same database is used by cloudera, since the stop (and the restart) command will block every new connection to the database and wait for all currently open connections to be closed before stopping. This means that you will have to stop the HiveMetastore, HiveServer2, and all Cloudera monitoring services first (the ActivityMonitor, cloudera-scm-server and all cloudera-scm-agents).
If you don’t fear being careless, the fast_stop command should shut down the database immediately and drop the currently open connections.