Random List of millions of elements in Python Efficiently
Posted by eWizardII on Stack Overflow
Published on 2011-01-08T02:35:07Z
Hello,
I have read this answer, which is potentially the best way to randomize a list of strings in Python. I am wondering whether it is also the most efficient way, because I have a list of about 30 million elements, built with the following code:
import json
from random import shuffle

a = []
for i in range(0, 193):
    # Each user_<i>.json file holds a list of records; close it after reading
    with open("C:/Twitter/user/user_" + str(i) + ".json") as json_data:
        data = json.load(json_data)
    for j in range(0, len(data)):
        a.append(data[j]['su'])

# The built-in set replaces the deprecated sets.Set and removes duplicates
new = list(set(a))
print("Cleaned length is: " + str(len(new)))

## Take Cleaned List and Randomize it for Analysis
shuffle(new)
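For comparison, here is a minimal sketch of one possible variant that adds each value straight into a set, so the intermediate list of all (duplicated) values is never built. The function names `load_batches`, `unique_values`, and `shuffled_unique` and the `seed` parameter are hypothetical and only for illustration. I have not measured whether this is actually faster on the real data.

```python
import json
import random


def load_batches(paths):
    """Yield the parsed contents of each JSON file, one file at a time."""
    for path in paths:
        with open(path) as handle:
            yield json.load(handle)


def unique_values(record_batches, key="su"):
    """Collect the distinct values of `key` across batches of records."""
    seen = set()
    for batch in record_batches:
        # update() consumes the generator without building a temporary list
        seen.update(record[key] for record in batch)
    return seen


def shuffled_unique(record_batches, key="su", seed=None):
    """Return the distinct values in random order (seed is optional)."""
    values = list(unique_values(record_batches, key))
    # Shuffles the list in place; a seed makes the order reproducible
    random.Random(seed).shuffle(values)
    return values
```

With the files from the code above, this would be called as new = shuffled_unique(load_batches("C:/Twitter/user/user_" + str(i) + ".json" for i in range(0, 193))).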
If there is a more efficient way to do this, I'd greatly appreciate any advice.
Thanks,
© Stack Overflow or respective owner