17

I am looking to get all table definitions in Hive. I know that for single table definition I can use something like -

  describe <<table_name>>
  describe extended <<table_name>>

But, I couldn't find a way to get all table definitions. Is there any table in megastore similar to Information_Schema in mysql or is there command to get all table definitions ?

1

3 Answers 3

27

You can do this by writing a simple bash script and some bash commands.

First, write all table names in a database to a text file using:

$hive -e 'show tables in <dbname>' | tee tables.txt

Then create a bash script (describe_tables.sh) to loop over each table in this list:

while read line
do
 echo "$line"
 eval "hive -e 'describe <dbname>.$line'"
done

Then execute the script:

$chmod +x describe_tables.sh
$./describe_tables.sh < tables.txt > definitions.txt

The definitions.txt file will contain all the table definitions.

Sign up to request clarification or add additional context in comments.

3 Comments

its a good solution but very slow. if the database has 100s of table, each time the bash script does a “hive -e ....” for 100+ times, the execution will take a huge time to complete. Some looping within “hive -e...” is required... something like “hive -e for loop desc <table> “.
Hi Apurvaa, @mostafazh or other... And about diretc SQL to show for example 3-columns <database_name, table_name, column_name> in a CSV file? There are something in Hive similar to this SELECT FROM INFORMATION_SCHEMA?
there is no need to put the eval command. You only need to do hive -e "describe <dbname>.$line" I am also quite doubtful, the code snippelt in its given form will loop
4

The above processes work, however it will be slow due to the fact that the hive connection is made for each query. Instead you can do what I just did for the same need below.

Use one of the above methods to get your list of tables. Then modify the list to make it a hive query for each table as follows:

describe my_table_01;
describe my_TABLE_02;

So you will have a flat file with the all your describe statements mentioned above. For example, if you have the query in a flat file called my_table_description.hql.

Get the output in one scoop as follows:

"hive -f my_table_description.hql > my_table_description.output

It is super fast and gets the output in one shot.

Comments

1
  1. Fetch list of hive databases hive -e 'show databases' > hive_databases.txt

  2. Echo each table's desc:

    cat hive_databases.txt | grep -v '^$' | while read LINE;
    do
      echo "## TableName:" $LINE
      eval "hive -e 'show tables in $LINE' | grep -v ^$ | grep -v Logging | grep -v tab_name | tee $LINE.tables.txt"
      cat $LINE.tables.txt | while read table
      do
        echo "### $LINE.$table" > $LINE.$table.desc.md
        eval "hive -e 'describe $LINE.$table'" >> $LINE.$table.desc.md
        sed -i 's/\t/|/g' ./$LINE.$table.desc.md
        sed -i 's/comment/comment\n|:--:|:--:|:--:|/g' ./$LINE.$table.desc.md
      done
    done

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.