Search Results

  • scikit learn extratreeclassifier hanging

    - by denson
    I'm running scikit-learn on some rather large training datasets: ~1,600,000,000 rows with ~500 features. The platform is Ubuntu Server 14.04, and the hardware has 100 GB of RAM and 20 CPU cores. The test datasets have about half as many rows. I set n_jobs = 10 and forest_size = 3*number_of_features, so about 1700 trees. If I reduce the number of features to about 350, training works fine, but with the full feature set of 500+ the training phase never completes. The process is still running and holding about 20 GB of RAM, but it is using 0% CPU. I have also successfully trained on datasets with ~400,000 rows but twice as many features, which completes after only about 1 hour. I am being careful to delete any arrays/objects that are not in use. Does anyone have any ideas I might try?
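
The post does not include code, so the following is only a minimal sketch of the configuration it describes, scaled down to a tiny synthetic dataset. It assumes the estimator is scikit-learn's ExtraTreesClassifier and that forest_size is passed as n_estimators; the array sizes, random data, n_jobs=2, and verbose=1 are placeholders chosen for illustration, not details from the post. Setting verbose=1 makes the parallel backend report progress as trees finish, which may help show whether a run is still advancing or has stalled.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

# Placeholder sizes (assumption): far smaller than the datasets in the post.
n_rows, n_features = 2000, 20

rng = np.random.RandomState(0)
X = rng.rand(n_rows, n_features).astype(np.float32)  # synthetic features
y = rng.randint(0, 2, size=n_rows)                   # synthetic binary labels

# Forest size rule quoted in the post: three trees per feature.
forest_size = 3 * n_features

clf = ExtraTreesClassifier(
    n_estimators=forest_size,  # assumed mapping of forest_size
    n_jobs=2,                  # the post uses n_jobs = 10
    verbose=1,                 # report progress while the trees are built
    random_state=0,
)
clf.fit(X, y)

print("trees built:", len(clf.estimators_))
```

If a small run like this completes while the full-size run stalls, repeating it with gradually larger n_rows and n_features is one way to narrow down where the behaviour changes.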
