Hive performance increase
Posted by Sagar Nikam on Stack Overflow
Published on 2012-11-01T08:17:16Z
Indexed on 2012/11/17 5:01 UTC
hive
I am working with a 2.5 GB database whose tables range from only 40 rows to 9 million rows. Queries on the large tables take a long time, and I want results faster.
Even a small query on a table with only 90 rows is slow:
hive> select count(*) from cidade;
Time taken: 50.172 seconds
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.block.size</name>
<value>131072</value>
<description>Default block size, in bytes, for new files.
</description>
</property>
</configuration>
Do these settings affect Hive's performance?
dfs.replication=3
dfs.block.size=131072 (that is, 128 KB)
Can I set it from the Hive prompt, like this?
hive> set dfs.replication=5;
Does this value remain in effect for that particular session only, or is it better to change it in the .xml file?
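To make the question concrete, this is the kind of session I have in mind. It is only a sketch of my understanding, not something I have confirmed: I am assuming that `set` with just a property name prints the current value, and that a value set this way applies only to the current session and to files written after it, not to data already stored in HDFS.

```sql
-- Print the value currently in effect (assumed to come from hdfs-site.xml).
SET dfs.replication;

-- Override it from the Hive prompt; my assumption is that this lasts
-- only for this session and does not re-replicate existing files.
SET dfs.replication=5;

-- Print it again to check that the override took effect.
SET dfs.replication;
```

If that assumption holds, a value I want for every session would presumably have to go in the .xml file instead.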
© Stack Overflow or respective owner