How to get the top keys from a hash by value
- by Kirs Kringle
I have a hash that I sorted by values greatest to least. How would I go about getting the top 5? There was a post on here that talked about getting only one value.
What is the easiest way to get a key with the highest value from a hash in Perl?
I understand that approach. So would I, say, get the key with the highest value, add it to an array, delete that element from the hash, and then repeat the process?
It seems like there should be an easier way to do this than that, though.
My hash is called %words.
use strict;
use warnings;
use Tk; #Learn to install here: http://factscruncher.blogspot.com/2012/01/easy-way-to-install-tk-on-strawberry.html
#Reading in the text file
my $file0 = Tk::MainWindow->new->Tk::getOpenFile;
open( my $filehandle0, '<', $file0 ) || die "Could not open $file0\n";
my @words;
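while ( my $line = <$filehandle0> ) {
    chomp $line;
    # split ' ' skips leading whitespace, so no empty first field is produced
    my @word = split( ' ', lc($line) );
    push( @words, @word );
}
# Strip punctuation; inside a character class neither escaping nor | is needed
for (@words) {
    s/[,.!?:;"]//g;
}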
#Counting words that repeat; put in hash
my %words_count;
$words_count{$_}++ for @words;
#Reading in the stopwords file
my $file1 = "stoplist.txt";
open( my $filehandle1, '<', $file1 ) or die "Could not open $file1\n";
my @stopwords;
while ( my $line = <$filehandle1> ) {
    chomp $line;
    my @linearray = split( " ", $line );
    push( @stopwords, @linearray );
}
# Lowercase the stopwords so they match the lowercased words above
@stopwords = map { lc } @stopwords;
#Comparing the array to the hash and deleting stopwords
my %words = %words_count;
for my $stopword ( @stopwords ) {
    delete $words{ $stopword };
}
#Sorting Hash Table
my @keys = sort {
    $words{$b} <=> $words{$a}
        or
    "\L$a" cmp "\L$b"
} keys %words;
#Starting Statistical Work
my $value_count = 0;
my $key_count = 0;
#Printing Hash Table
$key_count = keys %words;
foreach my $key (@keys) {
    $value_count += $words{$key};
    printf "%-20s %6d\n", $key, $words{$key};
}
# Avoid dividing by zero when no words are left after removing stopwords
my $value_average = $key_count ? $value_count / $key_count : 0;
#my @topwords;
#foreach my $key (@keys) {
#    if ( $words{$key} > $value_average ) {
#        push @topwords, $key;
#    }
#}
print "\n", "The number of values: ", $value_count, "\n";
print "The number of elements: ", $key_count, "\n";
print "The Average: ", $value_average, "\n\n";
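
Since @keys is already sorted from greatest to least, one possible way to get the top 5 without deleting anything from the hash is an array slice. This is only a minimal sketch: the sample data in %words below is made up to stand in for the real counts, and $n, $last, and @top are names introduced here for illustration. The guard on $last is there on the assumption that the hash might hold fewer than 5 keys.

```perl
use strict;
use warnings;

# Hypothetical sample counts standing in for %words from the script above.
my %words = (
    perl  => 9,
    hash  => 7,
    sort  => 7,
    key   => 4,
    value => 3,
    slice => 2,
    top   => 1,
);

# Same ordering as in the question: count descending, then alphabetical.
my @keys = sort {
    $words{$b} <=> $words{$a}
        or
    "\L$a" cmp "\L$b"
} keys %words;

# Take at most $n keys; cap the last index when there are fewer than $n keys.
my $n    = 5;
my $last = $#keys < $n - 1 ? $#keys : $n - 1;
my @top  = @keys[ 0 .. $last ];

printf "%-20s %6d\n", $_, $words{$_} for @top;
```

The slice leaves %words and @keys untouched, so the statistics computed later in the script would still see every word.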