Efficient way to get highly correlated pairs from large data set in Python or R
- by Akavall
I have a large data set (say, 10,000 variables with about 1,000 elements each). We can think of it as a 2D list, something like:
[[variable_1],
[variable_2],
............
[variable_n]
]
I want to extract the highly correlated variable pairs from that data, where the cutoff for "highly correlated" is a parameter that I can choose.
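To make the goal concrete, here is a minimal sketch of the brute-force baseline, assuming the data fits in memory as a NumPy array with one variable per row. The function name `correlated_pairs`, the `threshold` parameter, and the choice to compare the absolute value of Pearson's r (so strong negative correlations also count) are all assumptions made for illustration, not part of the original problem statement.

```python
import numpy as np

def correlated_pairs(data, threshold=0.9):
    """Return (i, j, r) for each pair of rows i < j with |r| >= threshold.

    `data` is a 2D array-like with one variable per row (an assumption).
    """
    corr = np.corrcoef(data)                   # full n x n Pearson correlation matrix
    strong = np.abs(corr) >= threshold         # boolean mask of strong correlations
    upper = np.triu(strong, k=1)               # keep i < j only: no diagonal, no duplicates
    return [(int(i), int(j), float(corr[i, j])) for i, j in np.argwhere(upper)]

# Tiny example: row 1 is a linear function of row 0, row 2 is unrelated.
x = np.arange(1000.0)
data = np.vstack([x, 2 * x + 1, (-1.0) ** np.arange(1000)])
pairs = correlated_pairs(data, threshold=0.9)
```

With 10,000 variables the full correlation matrix holds 10^8 float64 values (roughly 800 MB), so this is only a baseline to compare faster or more memory-efficient approaches against.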
I don't need all pairs…