Gradient boosting predictions in low-latency production environments?
Posted by lockedoff on Stack Overflow
Published on 2012-07-02T14:33:16Z
Indexed on 2012/10/30 11:02 UTC
machine-learning | classification
Can anyone recommend a strategy for making predictions using a gradient boosting model in the <10-15ms range (the faster the better)?
I have been using R's gbm package, but the first prediction takes ~50ms (subsequent vectorized predictions average about 1ms, so there appears to be a fixed overhead, perhaps in the call into the C++ library). As a guideline, there will be ~10-50 inputs and ~50-500 trees. The task is classification, and I need access to the predicted probabilities.
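To make the first-call versus warm-call comparison concrete, here is a minimal timing sketch in Python. It is not tied to gbm or any particular library: `measure_latency` and `dummy_predict` are hypothetical names, and `dummy_predict` is only a placeholder for whatever single-row prediction call is being evaluated.

```python
import time

def measure_latency(predict, row, warm_calls=100):
    """Return (first_call_ms, mean_warm_ms) for a single-row predict function."""
    # Time the very first call separately, since it may include one-off setup cost.
    start = time.perf_counter()
    predict(row)
    first_ms = (time.perf_counter() - start) * 1000.0

    # Time repeated calls and report the mean per-call latency.
    start = time.perf_counter()
    for _ in range(warm_calls):
        predict(row)
    warm_ms = (time.perf_counter() - start) * 1000.0 / warm_calls
    return first_ms, warm_ms

def dummy_predict(row):
    # Placeholder (assumption): stands in for a real model's predict call.
    return sum(row) / len(row)

first_ms, warm_ms = measure_latency(dummy_predict, [0.1, 0.2, 0.3])
```

Reporting the two numbers separately helps show whether the latency budget is being spent on a one-off setup cost or on every request.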
I know there are a lot of libraries out there, but I've had little luck finding information even on rough prediction times for them. The training will happen offline, so only predictions need to be fast -- also, predictions may come from a piece of code / library that is completely separate from whatever does the training (as long as there is a common format for representing the trees).
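As an illustration of what a "common format for representing the trees" could look like, below is a rough Python sketch of a standalone evaluator. The node layout (`feature, threshold, left, right, value`), the function names, and the example trees are all made up for illustration and are not the export format of gbm or any other library. It also assumes a binary classifier whose trees sum to a log-odds score that is mapped to a probability with the logistic function.

```python
import math

# Hypothetical flat format (an assumption, not a real library's format):
# each tree is a list of nodes, each node a tuple
#   (feature_index, threshold, left_child, right_child, value)
# A node with feature_index == -1 is a leaf and carries its output in `value`.

def predict_tree(nodes, x):
    """Walk one tree from the root (index 0) down to a leaf and return its value."""
    i = 0
    while True:
        feature, threshold, left, right, value = nodes[i]
        if feature < 0:
            return value
        i = left if x[feature] < threshold else right

def predict_proba(trees, init_score, x):
    """Sum the tree outputs on the log-odds scale and apply the logistic function."""
    score = init_score + sum(predict_tree(t, x) for t in trees)
    return 1.0 / (1.0 + math.exp(-score))

# Two tiny hand-written trees, purely for illustration.
trees = [
    [(0, 0.5, 1, 2, 0.0), (-1, 0.0, -1, -1, -0.4), (-1, 0.0, -1, -1, 0.6)],
    [(1, 2.0, 1, 2, 0.0), (-1, 0.0, -1, -1, -0.2), (-1, 0.0, -1, -1, 0.3)],
]
p = predict_proba(trees, 0.0, [0.7, 1.0])  # score = 0.6 + (-0.2) = 0.4
```

The work per prediction is one short root-to-leaf walk per tree, so with ~50-500 shallow trees the same loop ports directly to a compiled language if the interpreter overhead turns out to matter.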
© Stack Overflow or respective owner