Search Results

Search found 58 results on 3 pages for 'svm'.


  • SVM Classification - minimum number of input sets for each class

    - by Amol Joshi
    I'm trying to build an app that detects which images on a web page are advertisements. Once I detect them, I won't allow them to be displayed on the client side. From the help I got here on Stack Overflow, I concluded that an SVM is the best approach for my aim, so I have coded an SVM and SMO myself. The dataset, which I got from the UCI data repository, has 3280 instances (link to dataset: http://archive.ics.uci.edu/ml/datasets/Internet+Advertisements ), of which around 400 belong to the class representing advertisement images and the rest represent non-advertisement images. Right now I'm taking the first 2800 input sets and training the SVM on them. But after looking at the accuracy rate, I realised that most of those 2800 input sets are from the non-advertisement class, so I'm getting very good accuracy for that class. What can I do here? About how many input sets should I give the SVM for training, and how many of them for each class? Thanks. Cheers. (I made a new question because the context is different from my previous question: http://stackoverflow.com/questions/1991113/optimization-of-neural-network-input-data )
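    One way to avoid training on a slice that is dominated by a single class is to split each class separately, so that both classes appear in the training and test parts in the same proportion. The following is only a minimal sketch of that idea, not the poster's code; the function name `stratified_split`, the 80% training fraction, and the toy labels are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx) so each class keeps the same share in both parts."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        cut = int(round(train_fraction * len(members)))
        train_idx.extend(members[:cut])
        test_idx.extend(members[cut:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels mimicking the imbalance described above (assumed counts):
# 400 advertisement images and 2880 non-advertisement images.
labels = ["ad"] * 400 + ["nonad"] * 2880
train_idx, test_idx = stratified_split(labels)
```

    With a split like this, accuracy can also be reported per class rather than as a single number, which makes it visible when the minority class is being misclassified.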

    Read the article

  • Save PyML.classifiers.multi.OneAgainstRest(SVM()) object?

    - by Michael Aaron Safyan
    I'm using PyML to construct a multiclass linear support vector machine (SVM). After training the SVM, I would like to be able to save the classifier, so that on subsequent runs I can use the classifier right away without retraining. Unfortunately, the .save() function is not implemented for that classifier, and attempting to pickle it (with both standard pickle and cPickle) yields the following error message:

        pickle.PicklingError: Can't pickle : it's not found as __builtin__.PySwigObject

    Does anyone know of a way around this or of an alternative library without this problem? Thanks.

    Edit/Update

    I am now training and attempting to save the classifier with the following code:

        mc = multi.OneAgainstRest(SVM());
        mc.train(dataset_pyml,saveSpace=False);
        for i, classifier in enumerate(mc.classifiers):
            filename=os.path.join(prefix,labels[i]+".svm");
            classifier.save(filename);

    Notice that I am now saving with the PyML save mechanism rather than with pickling, and that I have passed "saveSpace=False" to the training function. However, I am still getting an error:

        ValueError: in order to save a dataset you need to train as: s.train(data, saveSpace = False)

    However, I am passing saveSpace=False... so, how do I save the classifier(s)?

    P.S. The project I am using this in is pyimgattr, in case you would like a complete testable example... the program is run with "./pyimgattr.py train"... that will get you this error. Also, a note on version information:

        [michaelsafyan@codemage /Volumes/Storage/classes/cse559/pyimgattr]$ python
        Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
        [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
        Type "help", "copyright", "credits" or "license" for more information.
        import PyML
        print PyML.__version__
        0.7.0

    Read the article

  • Issues in Convergence of Sequential minimal optimization for SVM

    - by Amol Joshi
    I have been working on support vector machines for about two months now. I have coded an SVM myself, and for the SVM optimization problem I have used Sequential Minimal Optimization (SMO) by John Platt. Right now I am at the stage where I am going to grid search to find the optimal C value for my dataset. (Please find details of my project application and dataset here: http://stackoverflow.com/questions/2284059/svm-classification-minimum-number-of-input-sets-for-each-class ) I have successfully checked my custom SVM implementation's accuracy for C values ranging from 2^0 to 2^6. But now I am having some issues with the convergence of SMO for C = 128. I have tried to find the alpha values for C=128, and it takes a long time before it actually converges and gives the alpha values. The time taken for SMO to converge is about 5 hours for C=100. This is huge, I think (because SMO is supposed to be fast), though I'm getting good accuracy. I am stuck right now because I cannot test the accuracy for higher values of C. I am displaying the number of alphas changed in every pass of SMO, and I keep getting 10, 13, 8... alphas changing continuously. The KKT conditions assure convergence, so what weird thing is happening here? Please note that my implementation works fine for C<=100 with good accuracy, though the execution time is long. Please give me input on this issue. Thank you and cheers.

    Read the article

  • Implementing a linear, binary SVM (support vector machine)

    - by static_rtti
    I want to implement a simple SVM classifier, in the case of high-dimensional binary data (text), for which I think a simple linear SVM is best. The reason for implementing it myself is basically that I want to learn how it works, so using a library is not what I want. The problem is that most tutorials go up to an equation that can be solved as a "quadratic problem", but they never show an actual algorithm! So could you point me either to a very simple implementation I could study, or (better) to a tutorial that goes all the way to the implementation details? Thanks a lot!
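    As a rough illustration of what "an actual algorithm" for a linear SVM can look like, the sketch below uses stochastic subgradient descent on the primal hinge-loss objective (a Pegasos-style update) rather than the quadratic-programming route the tutorials mention. It is a simplification under stated assumptions: there is no bias term (the data is assumed to be centred, or a constant feature can be appended), and the names `train_linear_svm` and `predict`, the regularisation value, and the toy data are all made up for the example.

```python
import random


def train_linear_svm(samples, labels, lam=0.01, epochs=200, seed=0):
    """Minimise lam/2 * ||w||^2 + mean hinge loss with stochastic subgradient steps.

    samples: list of equal-length feature lists; labels: +1 or -1.
    Returns the weight vector w (no bias term in this sketch).
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    order = list(range(len(samples)))
    step_count = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            step_count += 1
            eta = 1.0 / (lam * step_count)   # decaying step size
            shrink = 1.0 - 1.0 / step_count  # equals 1 - eta * lam
            x, y = samples[i], labels[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Regulariser part of the subgradient: shrink w towards zero.
            w = [shrink * wj for wj in w]
            # Hinge part: only samples inside the margin pull w towards them.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


def predict(w, x):
    """Classify by the sign of the linear score <w, x>."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Tiny linearly separable toy set (hypothetical data).
samples = [[2, 1], [1, 2], [2, 2], [-2, -1], [-1, -2], [-2, -2]]
labels = [1, 1, 1, -1, -1, -1]
w = train_linear_svm(samples, labels)
```

    For sparse, high-dimensional text data the same update can be written over dictionaries of non-zero features instead of dense lists; the dense version is used here only to keep the sketch short.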

    Read the article

  • How do I classify using SVM Classifier?

    - by Gomathi
    I'm doing a project on liver tumor classification. I initially used the Region Growing method for liver segmentation, and from that I segmented the tumor using FCM. I then obtained the texture features using the Gray Level Co-occurrence Matrix. My output for that was:

        stats =
            autoc: [1.857855266614132e+000 1.857955341199538e+000]
            contr: [5.103143332457753e-002 5.030548650257343e-002]
            corrm: [9.512661919561399e-001 9.519459060378332e-001]
            corrp: [9.512661919561385e-001 9.519459060378338e-001]
            cprom: [7.885631654779597e+001 7.905268525471267e+001]

    Now, how should I give this as an input to the SVM program?

        function [itr] = multisvm( T,C,tst )
        %MULTISVM(2.0) classifies the class of given training vector according to the
        % given group and gives us result that which class it belongs.
        % We have also to input the testing matrix
        %Inputs: T=Training Matrix, C=Group, tst=Testing matrix
        %Outputs: itr=Resultant class(Group,USE ROW VECTOR MATRIX) to which tst set belongs
        %----------------------------------------------------------------------%
        % IMPORTANT: DON'T USE THIS PROGRAM FOR CLASS LESS THAN 3,             %
        %            OTHERWISE USE svmtrain,svmclassify DIRECTLY or            %
        %            add an else condition also for that case in this program. %
        %            Modify required data to use Kernel Functions and Plot also%
        %----------------------------------------------------------------------%
        % Date:11-08-2011(DD-MM-YYYY)                                          %
        % This function for multiclass Support Vector Machine is written by
        % ANAND MISHRA (Machine Vision Lab. CEERI, Pilani, India)
        % and this is free to use. email: [email protected]
        % Updated version 2.0 Date:14-10-2011(DD-MM-YYYY)
        u=unique(C);
        N=length(u);
        c4=[];
        c3=[];
        j=1;
        k=1;
        if(N>2)
            itr=1;
            classes=0;
            cond=max(C)-min(C);
            while((classes~=1)&&(itr<=length(u))&& size(C,2)>1 && cond>0)
                %This while loop is the multiclass SVM Trick
                c1=(C==u(itr));
                newClass=c1;
                svmStruct = svmtrain(T,newClass);
                classes = svmclassify(svmStruct,tst);
                % This is the loop for Reduction of Training Set
                for i=1:size(newClass,2)
                    if newClass(1,i)==0;
                        c3(k,:)=T(i,:);
                        k=k+1;
                    end
                end
                T=c3;
                c3=[];
                k=1;
                % This is the loop for reduction of group
                for i=1:size(newClass,2)
                    if newClass(1,i)==0;
                        c4(1,j)=C(1,i);
                        j=j+1;
                    end
                end
                C=c4;
                c4=[];
                j=1;
                cond=max(C)-min(C); % Condition for avoiding group
                                    %to contain similar type of values
                                    %and the reduce them to process
                % This condition can select the particular value of iteration
                % base on classes
                if classes~=1
                    itr=itr+1;
                end
            end
        end
        end

    Kindly guide me.

    Read the article

  • Reducing Dimension For SVM in Sensor Network

    - by iinception
    Hi everyone, I am looking for some suggestions on a problem that I am currently facing. I have a set of sensors, say S1-S100, which are triggered when one of the events E1-E20 is performed. Assume that normally E1 triggers S1-S20, E2 triggers S15-S30, E3 triggers S20-S50, etc., and that E1-E20 are completely independent events. Occasionally an event E might trigger any other, unrelated sensor. I am using an ensemble of 20 SVMs to analyze each event separately. My features are the sensor frequencies F1-F100, the number of times each sensor is triggered, and a few other related features. I am looking for a technique that can reduce the dimensionality of the sensor features (F1-F100), or some technique that encompasses all of the sensors and reduces the dimension too (I have been looking into information theory concepts for the last few days). I don't think averaging or maximization is a good idea, as I risk losing information (it did not give me good results). Can somebody please suggest what I am missing here? A paper or some starting idea... Thanks in advance.

    Read the article

  • SVM in OpenCV: Visual Studio 2008 reported error wrongly (or is it right?)

    - by Risa
    I'm using MS Visual Studio 2008, OpenCV, C++ and an SVM for an OCR-related project. I could run the code until yesterday; when I opened the project to continue working, VS reported this error:

        error C2664: 'bool CvSVM::train(const CvMat *,const CvMat *,const CvMat *,const CvMat *,CvSVMParams)' : cannot convert parameter 1 from 'cv::Mat' to 'const CvMat *'

    It didn't happen before, and I haven't changed any code related to it (I only changed the parameters for the kernel). The code that produces the error is:

        Mat curTrainData, curTrainLabel;
        CvSVM svm;
        . . .
        svm.train(curTrainData, curTrainLabel, Mat(), Mat(), params);

    If I hover over the code, I still get this tip: image. Which means my syntax isn't wrong. So why does VS bother to report such an error?

    Read the article

  • kernel warning disk error for command write - solaris svm

    - by help_me
    Recently this warning came up in my message logs:

        scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2/sd@0,0 (sd0):
        Oct 27 00:14:44 Error for Command: write(10) Error Level:Retryable
        Oct 27 00:14:44 scsi: [ID 107833 kern.notice] Requested Block: 101515828 Error Block: 101515828
        Oct 27 00:14:44 scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: 0441B9B5H
        Oct 27 00:14:44 scsi: [ID 107833 kern.notice] Sense Key: Hardware Error
        Oct 27 00:14:44 scsi: [ID 107833 kern.notice] ASC: 0x19 (defect list error), ASCQ: 0x0, FRU: 0x2

    This shows signs of a failing disk, in my opinion. I have not seen the messages recurring. This is on a Solaris 9 SPARC system (V240). The disks are managed by SVM, and "metadb" is showing the flags as "a". Are there any tests or indications to check whether the disk is actually failing, or was that error message triggered by something else? Thank you!

    Read the article

  • A good machine learning technique to weed out good URLs from bad

    - by git-noob
    Hi, I have an application that needs to discriminate between good HTTP GET requests and bad. For example: http://somesite.com?passes=dodgy+parameter # BAD http://anothersite.com?passes=a+good+parameter # GOOD My system can make a binary decision about whether or not a URL is good or bad - but ideally I would like it to predict whether or not a previously unseen URL is good or bad. http://some-new-site.com?passes=a+really+dodgy+parameter # BAD I feel the need for a support vector machine (SVM) ... but I need to learn machine learning. Some questions: 1) Is an SVM appropriate for this task? 2) Can I train it with the raw URLs? - without explicitly specifying 'features' 3) How many URLs will I need for it to be good at predictions? 4) What kind of SVM kernel should I use? 5) After I train it, how do I keep it up to date? 6) How do I test unseen URLs again the SVM to decide whether it's good or bad? I
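    On question 2 above: an SVM works on numeric vectors, so raw URL strings generally have to be mapped to features first. One possible mapping, sketched here purely as an illustration, is a sparse bag of character n-grams taken from the query-string values. The function name `url_features`, the choice of trigrams, and the `parameter:ngram` key format are assumptions, not something the post specifies.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def url_features(url, n=3):
    """Map a URL to a sparse bag of character n-grams from its query values."""
    query = urlsplit(url).query
    counts = Counter()
    for key, value in parse_qsl(query, keep_blank_values=True):
        text = value.lower()  # parse_qsl has already decoded '+' to a space
        for i in range(len(text) - n + 1):
            counts[key + ":" + text[i:i + n]] += 1
    return dict(counts)


# One of the example URLs from the post.
feats = url_features("http://somesite.com?passes=dodgy+parameter")
```

    Each distinct key would then be assigned a feature index, giving the sparse vectors that SVM libraries typically accept.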

    Read the article

  • Loading a PyML multiclass classifier... why isn't this working?

    - by Michael Aaron Safyan
    This is a followup from "Save PyML.classifiers.multi.OneAgainstRest(SVM()) object?". I am using PyML for a computer vision project (pyimgattr), and have been having trouble storing/loading a multiclass classifier. When attempting to load one of the SVMs in a composite classifier, with loadSVM, I am getting: ValueError: invalid literal for float(): rest Note that this does not happen with the first classifier that I load, only with the second. What is causing this error, and what can I do to get around this so that I can properly load the classifier? Details To better understand the trouble I'm running into, you may want to look at pyimgattr.py (currently revision 11). I am invoking the program with "./pyimgattr.py train" which trains the classifier (invokes train on line 571, which trains the classifier with trainmulticlassclassifier on line 490 and saves it with storemulticlassclassifier on line 529), and then invoking the program with "./pyimgattr.py test" which loads the classifier in order to test it with the testing dataset (invokes test on line 628, which invokes loadmulticlassclassifier on line 549). The multiclass classifier consists of several one-against-rest SVMs which are saved individually. The loadmulticlassclassifier function loads these individually by calling loadSVM() on several different files. It is in this call to loadSVM (done indirectly in loadclassifier on line 517) that I get an error. The first of the one-against-rest classifiers loads successfully, but the second one does not. A transcript is as follows: $ ./pyimgattr.py test [INFO] pyimgattr -- Loading attributes from "classifiers/attributes.lst"... [INFO] pyimgattr -- Loading classnames from "classifiers/classnames.lst"... [INFO] pyimgattr -- Loading dataset "attribute_data/apascal_test.txt"... [INFO] pyimgattr -- Loaded dataset "attribute_data/apascal_test.txt". [INFO] pyimgattr -- Loading multiclass classifier from "classifiers/classnames_from_attributes"... 
[INFO] pyimgattr -- Constructing object into which to store loaded data... [INFO] pyimgattr -- Loading manifest data... [INFO] pyimgattr -- Loading classifier from "classifiers/classnames_from_attributes/aeroplane.svm".... scanned 100 patterns scanned 200 patterns read 100 patterns read 200 patterns {'50': 38, '60': 45, '61': 46, '62': 47, '49': 37, '52': 39, '53': 40, '24': 16, '25': 17, '26': 18, '27': 19, '20': 12, '21': 13, '22': 14, '23': 15, '46': 34, '47': 35, '28': 20, '29': 21, '40': 32, '41': 33, '1': 1, '0': 0, '3': 3, '2': 2, '5': 5, '4': 4, '7': 7, '6': 6, '8': 8, '58': 44, '39': 31, '38': 30, '15': 9, '48': 36, '16': 10, '19': 11, '32': 24, '31': 23, '30': 22, '37': 29, '36': 28, '35': 27, '34': 26, '33': 25, '55': 42, '54': 41, '57': 43} read 250 patterns in LinearSparseSVModel done LinearSparseSVModel constructed model [INFO] pyimgattr -- Loaded classifier from "classifiers/classnames_from_attributes/aeroplane.svm". [INFO] pyimgattr -- Loading classifier from "classifiers/classnames_from_attributes/bicycle.svm".... 
label at None delimiter , Traceback (most recent call last): File "./pyimgattr.py", line 797, in sys.exit(main(sys.argv)); File "./pyimgattr.py", line 782, in main return test(attributes_file,classnames_file,testing_annotations_file,testing_dataset_path,classifiers_path,logger); File "./pyimgattr.py", line 635, in test multiclass_classnames_from_attributes_classifier = loadmulticlassclassifier(classnames_from_attributes_folder,logger); File "./pyimgattr.py", line 529, in loadmulticlassclassifier classifiers.append(loadclassifier(os.path.join(filename,label+".svm"),logger)); File "./pyimgattr.py", line 502, in loadclassifier result=loadSVM(filename,datasetClass = SparseDataSet); File "/Library/Python/2.6/site-packages/PyML/classifiers/svm.py", line 328, in loadSVM data = datasetClass(fileName, **args) File "/Library/Python/2.6/site-packages/PyML/containers/vectorDatasets.py", line 224, in __init__ BaseVectorDataSet.__init__(self, arg, **args) File "/Library/Python/2.6/site-packages/PyML/containers/baseDatasets.py", line 214, in __init__ self.constructFromFile(arg, **args) File "/Library/Python/2.6/site-packages/PyML/containers/baseDatasets.py", line 243, in constructFromFile for x in parser : File "/Library/Python/2.6/site-packages/PyML/containers/parsers.py", line 426, in next x = [float(token) for token in tokens[self._first:self._last]] ValueError: invalid literal for float(): rest

    Read the article

  • PyML 0.7.2 - How to prevent accuracy from dropping after storing/loading a classifier?

    - by Michael Aaron Safyan
    This is a followup from "Save PyML.classifiers.multi.OneAgainstRest(SVM()) object?". The solution to that question was close, but not quite right, (the SparseDataSet is broken, so attempting to save/load with that dataset container type will fail, no matter what. Also, PyML is inconsistent in terms of whether labels should be numbers or strings... it turns out that the oneAgainstRest function is actually not good enough, because the labels need to be strings and simultaneously convertible to floats, because there are places where it is assumed to be a string and elsewhere converted to float) and so after a great deal of hacking and such I was finally able to figure out a way to save and load my multi-class classifier without it blowing up with an error.... however, although it is no longer giving me an error message, it is still not quite right as the accuracy of the classifier drops significantly when it is saved and then reloaded (so I'm still missing a piece of the puzzle). I am currently using the following custom mutli-class classifier for training, saving, and loading: class SVM(object): def __init__(self,features_or_filename,labels=None,kernel=None): if isinstance(features_or_filename,str): filename=features_or_filename; if labels!=None: raise ValueError,"Labels must be None if loading from a file."; with open(os.path.join(filename,"uniquelabels.list"),"rb") as uniquelabelsfile: self.uniquelabels=sorted(list(set(pickle.load(uniquelabelsfile)))); self.labeltoindex={}; for idx,label in enumerate(self.uniquelabels): self.labeltoindex[label]=idx; self.classifiers=[]; for classidx, classname in enumerate(self.uniquelabels): self.classifiers.append(PyML.classifiers.svm.loadSVM(os.path.join(filename,str(classname)+".pyml.svm"),datasetClass = PyML.VectorDataSet)); else: features=features_or_filename; if labels==None: raise ValueError,"Labels must not be None when training."; self.uniquelabels=sorted(list(set(labels))); self.labeltoindex={}; for idx,label in 
enumerate(self.uniquelabels): self.labeltoindex[label]=idx; points = [[float(xij) for xij in xi] for xi in features]; self.classifiers=[PyML.SVM(kernel) for label in self.uniquelabels]; for i in xrange(len(self.uniquelabels)): currentlabel=self.uniquelabels[i]; currentlabels=['+1' if k==currentlabel else '-1' for k in labels]; currentdataset=PyML.VectorDataSet(points,L=currentlabels,positiveClass='+1'); self.classifiers[i].train(currentdataset,saveSpace=False); def accuracy(self,pts,labels): logger=logging.getLogger("ml"); correct=0; total=0; classindexes=[self.labeltoindex[label] for label in labels]; h=self.hypotheses(pts); for idx in xrange(len(pts)): if h[idx]==classindexes[idx]: logger.info("RIGHT: Actual \"%s\" == Predicted \"%s\"" %(self.uniquelabels[ classindexes[idx] ], self.uniquelabels[ h[idx] ])); correct+=1; else: logger.info("WRONG: Actual \"%s\" != Predicted \"%s\"" %(self.uniquelabels[ classindexes[idx] ], self.uniquelabels[ h[idx] ])) total+=1; return float(correct)/float(total); def prediction(self,pt): h=self.hypothesis(pt); if h!=None: return self.uniquelabels[h]; return h; def predictions(self,pts): h=self.hypotheses(self,pts); return [self.uniquelabels[x] if x!=None else None for x in h]; def hypothesis(self,pt): bestvalue=None; bestclass=None; dataset=PyML.VectorDataSet([pt]); for classidx, classifier in enumerate(self.classifiers): val=classifier.decisionFunc(dataset,0); if (bestvalue==None) or (val>bestvalue): bestvalue=val; bestclass=classidx; return bestclass; def hypotheses(self,pts): bestvalues=[None for pt in pts]; bestclasses=[None for pt in pts]; dataset=PyML.VectorDataSet(pts); for classidx, classifier in enumerate(self.classifiers): for ptidx in xrange(len(pts)): val=classifier.decisionFunc(dataset,ptidx); if (bestvalues[ptidx]==None) or (val>bestvalues[ptidx]): bestvalues[ptidx]=val; bestclasses[ptidx]=classidx; return bestclasses; def save(self,filename): if not os.path.exists(filename): os.makedirs(filename); with 
open(os.path.join(filename,"uniquelabels.list"),"wb") as uniquelabelsfile: pickle.dump(self.uniquelabels,uniquelabelsfile,pickle.HIGHEST_PROTOCOL); for classidx, classname in enumerate(self.uniquelabels): self.classifiers[classidx].save(os.path.join(filename,str(classname)+".pyml.svm")); I am using the latest version of PyML (0.7.2, although PyML.__version__ is 0.7.0). When I construct the classifier with a training dataset, the reported accuracy is ~0.87. When I then save it and reload it, the accuracy is less than 0.001. So, there is something here that I am clearly not persisting correctly, although what that may be is completely non-obvious to me. Would you happen to know what that is?

    Read the article

  • measuring uncertainty in matlabs svmclassify

    - by Mark
    I'm doing contextual object recognition and I need a prior for my observations, e.g. this space was labeled "dog"; what's the probability that it was labeled correctly? Do you know if MATLAB's svmclassify has an argument to return this level of certainty with its classification? If not, MATLAB's SVM has the following structure in it:

        SVM =
            SupportVectors: [11x124 single]
            Alpha: [11x1 double]
            Bias: 0.0915
            KernelFunction: @linear_kernel
            KernelFunctionArgs: {}
            GroupNames: {11x1 cell}
            SupportVectorIndices: [11x1 double]
            ScaleData: [1x1 struct]
            FigureHandles: []

    Can you think of any ways to compute a good measure of uncertainty from these? (Which support vector to use?) Papers/articles explaining uncertainty in SVMs are welcome. More in-depth explanations of MATLAB's SVM are also welcome. If you can't do it this way, can you think of any other libraries with SVMs that have this measure of uncertainty?
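    One common way to turn an SVM output into a confidence value is to compute the signed decision value from the support vectors, alphas, and bias, and then pass it through a sigmoid whose two parameters are fitted on held-out data (Platt scaling). The sketch below shows only the arithmetic. It assumes a linear kernel, assumes the alphas already carry the class sign (a particular toolbox's sign convention may differ), and uses hypothetical names and values throughout.

```python
import math


def decision_value(support_vectors, alphas, bias, x):
    """Linear-kernel decision value f(x) = sum_i alpha_i * <sv_i, x> + bias."""
    return sum(
        a * sum(s * xj for s, xj in zip(sv, x))
        for sv, a in zip(support_vectors, alphas)
    ) + bias


def platt_probability(f, A, B):
    """Platt's sigmoid: P(y = +1 | f) = 1 / (1 + exp(A * f + B))."""
    return 1.0 / (1.0 + math.exp(A * f + B))


# Hypothetical model with two support vectors; A and B would normally be
# fitted on held-out decision values, here they are placeholders.
f = decision_value([[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5], 0.1, [2.0, 1.0])
p = platt_probability(f, A=-2.0, B=0.0)
```

    The further a point lies from the separating hyperplane, the closer the sigmoid output gets to 0 or 1; points on the hyperplane map to 0.5 when B is zero.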

    Read the article

  • High volume SVM (machine learning) system

    - by flyingcrab
    I'm working on a possible machine learning project that would be expected to do high-speed computations using SVMs (support vector machines) and possibly some ANNs. I'm reasonably comfortable working in MATLAB with these, but primarily on small datasets, just for experimentation. I'm wondering whether this MATLAB-based approach will scale, or should I be looking into something else? C++ / GPU-based computing? Java wrapping of the MATLAB code and pushing it onto App Engine? Incidentally, there seems to be a lot of literature on GPUs, but not much on how useful they are for machine learning applications using MATLAB. And what about the cheapest CUDA-enabled GPU money can buy? Is it even worth the trouble?

    Read the article

  • Memory allocation problem with SVMs in OpenCV

    - by worksintheory
    Hi, I've been using OpenCV happily for a while, but now I have a problem which has bugged me for quite some time. The following code is reasonably minimal example of my problem: #include <cv.h> #include <ml.h> using namespace cv; int main(int argc, char **argv) { int sampleCountForTesting = 2731; //BROKEN: Breaks svm.train_auto(...) for values of 2731 or greater! Mat trainingData( sampleCountForTesting, 1, CV_32FC1, Scalar::all(0.0) ); Mat trainingResponses( sampleCountForTesting, 1, CV_32FC1, Scalar::all(0.0) ); for(int j = 0; j < 6; j++) { trainingData.at<float>( j, 0 ) = (float) (j%2); trainingResponses.at<float>( j, 0 ) = (float) (j%2); //Setting a few values so I don't get a "single class" error } CvSVMParams svmParams( 100, //100 is CvSVM::C_SVC, 2, //2 is CvSVM::RBF, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, NULL, TermCriteria( TermCriteria::MAX_ITER | TermCriteria::EPS, 2, 1.0 ) ); CvSVM svm = CvSVM(); svm.train_auto( trainingData, trainingResponses, Mat(), Mat(), svmParams ); return 0; } I just create matrices to hold the training data and responses, then set a few entries to some value other than zero, then run the SVM. But it breaks whenever there are 2731 rows or more: OpenCV Error: One of arguments' values is out of range (requested size is negative or too big) in cvMemStorageAlloc, file [omitted]/opencv/OpenCV-2.2.0/modules/core/src/datastructs.cpp, line 332 With fewer rows, it seems to be fine and a classifier trained in a similar manner to the above seems to be giving reasonable output. Am I doing something wrong? I'm pretty sure it's not actually anything to do with lack of memory, as I've got 6GB and also the code works fine when the data has 2730 rows and 10000 columns, which is a much bigger allocation. I'm running OpenCV 2.2 on OSX 10.6 and initially I thought the problem might be related to this bug if for some reason the fix wasn't included in the MacPorts version. 
Now I've also tried downloading the most recent stable version from the OpenCV site and building with cmake and using that, but I still get the same error, and the fix is definitely included in that version. Any help would be much appreciated! Thanks,

    Read the article

  • How to compute the probability of a multi-class prediction using libsvm?

    - by Cuga
    I'm using libsvm and the documentation leads me to believe that there's a way to output the believed probability of an output classification's accuracy. Is this so? And if so, can anyone provide a clear example of how to do it in code? Currently, I'm using the Java libraries in the following manner SvmModel model = Svm.svm_train(problem, parameters); SvmNode x[] = getAnArrayOfSvmNodesForProblem(); double predictedValue = Svm.svm_predict(model, x);

    Read the article

  • Which machine learning library to use

    - by Space_C0wb0y
    I am looking for a library that, ideally, has the following features:

        - implements hierarchical clustering of multidimensional data (ideally on a similarity or distance matrix)
        - implements support vector machines
        - is in C++
        - is somewhat documented (this one seems to be hardest)

    I would like this to be in C++, as I am most comfortable with that language, but I will also use any other language if the library is worth it. I have googled and found some, but I do not really have the time to try them all out, so I want to hear what experiences other people have had. Please only answer if you have some experience with the library you recommend. P.S.: I could also use different libraries for the clustering and the SVM.

    Read the article

  • Calculating Nearest Match to Mean/Stddev Pair With LibSVM

    - by Chris S
    I'm new to SVMs, and I'm trying to use the Python interface to libsvm to classify a sample containing a mean and stddev. However, I'm getting nonsensical results. Is this task inappropriate for SVMs or is there an error in my use of libsvm? Below is the simple Python script I'm using to test:

        #!/usr/bin/env python
        # Simple classifier test.
        # Adapted from the svm_test.py file included in the standard libsvm distribution.
        from collections import defaultdict
        from svm import *

        # Define our sparse data formatted training and testing sets.
        labels = [1,2,3,4]
        train = [ # key: 0=mean, 1=stddev
            {0:2.5,1:3.5},
            {0:5,1:1.2},
            {0:7,1:3.3},
            {0:10.3,1:0.3},
        ]
        problem = svm_problem(labels, train)
        test = [
            ({0:3, 1:3.11},1),
            ({0:7.3,1:3.1},3),
            ({0:7,1:3.3},3),
            ({0:9.8,1:0.5},4),
        ]

        # Test classifiers.
        kernels = [LINEAR, POLY, RBF]
        kname = ['linear','polynomial','rbf']
        correct = defaultdict(int)
        for kn,kt in zip(kname,kernels):
            print kt
            param = svm_parameter(kernel_type = kt, C=10, probability = 1)
            model = svm_model(problem, param)
            for test_sample,correct_label in test:
                pred_label, pred_probability = model.predict_probability(test_sample)
                correct[kn] += pred_label == correct_label

        # Show results.
        print '-'*80
        print 'Accuracy:'
        for kn,correct_count in correct.iteritems():
            print '\t',kn, '%.6f (%i of %i)' % (correct_count/float(len(test)), correct_count, len(test))

    The domain seems fairly simple. I'd expect that if it's trained to know a mean of 2.5 means label 1, then when it sees a mean of 2.4, it should return label 1 as the most likely classification. However, each kernel has an accuracy of 0%. Why is this? On a side note, is there a way to hide all the verbose training output dumped by libsvm in the terminal? I've searched libsvm's docs and code, but I can't find any way to turn this off.

    Read the article

  • Is there a test to see if hardware virtualization (vmx / svm) are presently enabled within a Linux session?

    - by Dr. Edward Morbius
    I'm writing procedures for configuring VirtualBox support for 64-bit SMP guests, which requires hardware virtualization support (VT-x/Intel, AMD-V/AMD). I have successfully configured this myself, however I'd like the procedure to be clear.

        sed -ne '/^flags/s/^.*: //p' /proc/cpuinfo | egrep -q '(vmx|svm)' && echo Has hardware virt || echo No HW virt

    ... shows if the CPU is capable. I've still got to go enable the feature in BIOS. Is there any way to test from within Linux whether this is on or not? Thanks. (Edit: s/xvm/svm/ in title)
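The capability check in the sed/egrep one-liner above can also be sketched in Python, which may be easier to embed in a longer setup procedure. This is only an illustration: the function name `hw_virt_flags` and the `SAMPLE` text are made up for the example, and, like the shell command, it reports whether the CPU advertises `vmx` or `svm`, not whether the feature is switched on in the BIOS.

```python
def hw_virt_flags(cpuinfo_text):
    """Return the subset of {'vmx', 'svm'} listed on any 'flags' line."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Compare whole flag names rather than substrings.
            found |= {"vmx", "svm"} & set(value.split())
    return found


# Hypothetical excerpt in /proc/cpuinfo layout, used only for illustration.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme de pse vmx sse2\n"

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            flags = hw_virt_flags(fh.read())
    except OSError:
        print("Could not read /proc/cpuinfo")
    else:
        if flags:
            print("Has hardware virt: " + ", ".join(sorted(flags)))
        else:
            print("No HW virt")
```

Unlike the egrep pattern, this matches complete flag names, so a longer flag that merely contains "svm" or "vmx" would not count.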

    Read the article

  • Nominal Attributes in LibSVM

    - by Chris S
    When creating a libsvm training file, how do you differentiate between a nominal attribute versus a numeric attribute? I'm trying to encode certain nominal attributes as integers, but I want to ensure libsvm doesn't misinterpret them as numeric values. Unfortunately, libsvm's site seems to have very little documentation. Pentaho's docs seem to imply libsvm makes this distinction, but I'm still not clear how it's made.
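For context, a libsvm training line is just a label followed by ascending `index:value` pairs, and every value is read as a number, so the file itself carries no "nominal" marker. A common workaround is to give each category its own 0/1 indicator feature instead of a single integer code. The sketch below illustrates that idea only; the helper names, the `COLORS` category list, and the sample values are invented for the example.

```python
def one_hot_pairs(value, categories, first_index):
    """Return [(index, 1)] for the indicator feature matching `value`."""
    return [(first_index + categories.index(value), 1)]


def libsvm_line(label, numeric_values, nominal_value, categories):
    """Format one training line: numeric features first, then the indicators."""
    # libsvm indices start at 1; zero-valued features may be omitted (sparse format).
    pairs = [(i, v) for i, v in enumerate(numeric_values, start=1) if v != 0]
    pairs += one_hot_pairs(nominal_value, categories,
                           first_index=len(numeric_values) + 1)
    return str(label) + " " + " ".join("%d:%g" % (i, v) for i, v in pairs)


# Hypothetical nominal attribute with three categories.
COLORS = ["red", "green", "blue"]

print(libsvm_line(1, [2.5, 0.0], "green", COLORS))
```

With two numeric features, the three colours occupy indices 3 to 5, so "green" becomes `4:1` and no ordering between the colours is implied.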

    Read the article

  • How to figure out optimal C / Gamma parameters in libsvm?

    - by Cuga
    I'm using libsvm for multi-class classification of datasets with a large number of features/attributes (around 5,800 per item). I'd like to choose better values for C and Gamma than the defaults I am currently using. I've already tried running easy.py, but for the datasets I'm using, the estimated run time is impractical: I ran easy.py on 20, 50, 100, and 200 data samples, and fitting the timings gave a super-linear curve that projects a full run to take years. Is there a quicker way to arrive at better C and Gamma values than the defaults? I'm using the Java libraries, if that makes any difference.
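One way to cut the search cost is to evaluate a coarse, exponentially spaced grid on a small subsample first and only refine around the best cell. The sketch below shows just the grid bookkeeping. The exponent ranges are, as far as I recall, the ones libsvm's grid.py uses by default (treat them as an assumption), the step of 4 is an arbitrary coarsening, and `fake_cv_accuracy` is a made-up placeholder that stands in for a real cross-validation run.

```python
import math


def log2_grid(start, stop, step):
    """Return 2**start, 2**(start+step), ... for exponents not exceeding stop."""
    return [2.0 ** e for e in range(start, stop + 1, step)]


def grid_search(evaluate, c_values, gamma_values):
    """Return (best_score, C, gamma); `evaluate(C, gamma)` is caller-supplied."""
    best = None
    for c in c_values:
        for g in gamma_values:
            score = evaluate(c, g)
            if best is None or score > best[0]:
                best = (score, c, g)
    return best


def fake_cv_accuracy(c, gamma):
    # Placeholder only: pretends the best setting is C=2**3, gamma=2**-7.
    # Replace with cross-validation accuracy measured on a subsample.
    return -abs(math.log2(c) - 3) - abs(math.log2(gamma) + 7)


coarse_c = log2_grid(-5, 15, 4)       # exponents -5, -1, 3, 7, 11, 15
coarse_gamma = log2_grid(-15, 3, 4)   # exponents -15, -11, -7, -3, 1
score, best_c, best_gamma = grid_search(fake_cv_accuracy, coarse_c, coarse_gamma)
print("best C=%g gamma=%g" % (best_c, best_gamma))
```

The coarse pass here costs 30 evaluations; a second pass with a smaller step around the winning exponents would then narrow the values further.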

    Read the article

  • How do I classify using GLCM and SVM Classifier in Matlab?

    - by Gomathi
    I'm on a project of liver tumor segmentation and classification. I used Region Growing and FCM for liver and tumor segmentation respectively. Then, I used the Gray Level Co-occurrence Matrix (GLCM) for texture feature extraction. I have to use a Support Vector Machine for classification, but I don't know how to normalize the feature vectors. Can anyone tell me how to program it in Matlab? To the GLCM program, I gave the tumor segmented image as input. Was I correct? If so, I think, then, my output will also be correct. My GLCM code, as far as I have tried, is:

        I = imread('fzliver3.jpg');
        GLCM = graycomatrix(I,'Offset',[2 0;0 2]);
        stats = graycoprops(GLCM,'all')
        t1= struct2array(stats)
        I2 = imread('fzliver4.jpg');
        GLCM2 = graycomatrix(I2,'Offset',[2 0;0 2]);
        stats2 = graycoprops(GLCM2,'all')
        t2= struct2array(stats2)
        I3 = imread('fzliver5.jpg');
        GLCM3 = graycomatrix(I3,'Offset',[2 0;0 2]);
        stats3 = graycoprops(GLCM3,'all')
        t3= struct2array(stats3)
        t=[t1;t2;t3]
        xmin = min(t);
        xmax = max(t);
        scale = xmax-xmin;
        tf=(x-xmin)/scale

    Was this a correct implementation? Also, I get an error at the last line. My output is:

        stats =
            Contrast: [0.0510 0.0503]
            Correlation: [0.9513 0.9519]
            Energy: [0.8988 0.8988]
            Homogeneity: [0.9930 0.9935]
        t1 =
          Columns 1 through 6
            0.0510 0.0503 0.9513 0.9519 0.8988 0.8988
          Columns 7 through 8
            0.9930 0.9935
        stats2 =
            Contrast: [0.0345 0.0339]
            Correlation: [0.8223 0.8255]
            Energy: [0.9616 0.9617]
            Homogeneity: [0.9957 0.9957]
        t2 =
          Columns 1 through 6
            0.0345 0.0339 0.8223 0.8255 0.9616 0.9617
          Columns 7 through 8
            0.9957 0.9957
        stats3 =
            Contrast: [0.0230 0.0246]
            Correlation: [0.7450 0.7270]
            Energy: [0.9815 0.9813]
            Homogeneity: [0.9971 0.9970]
        t3 =
          Columns 1 through 6
            0.0230 0.0246 0.7450 0.7270 0.9815 0.9813
          Columns 7 through 8
            0.9971 0.9970
        t =
          Columns 1 through 6
            0.0510 0.0503 0.9513 0.9519 0.8988 0.8988
            0.0345 0.0339 0.8223 0.8255 0.9616 0.9617
            0.0230 0.0246 0.7450 0.7270 0.9815 0.9813
          Columns 7 through 8
            0.9930 0.9935
            0.9957 0.9957
            0.9971 0.9970
        ??? Error using ==> minus
        Matrix dimensions must agree.

    The images are:
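The last line of the Matlab code subtracts the 1x8 row vector `xmin` from `x`, which is not defined in the snippet and evidently does not have a matching size; presumably the 3x8 matrix `t` was intended. The column-wise min-max scaling being attempted can be sketched as follows, in Python for illustration and using the `t` values printed above. The function name is made up for the example. In Matlab, the same idea needs the row vectors expanded to the size of `t`, for example with `bsxfun` or `repmat`, plus element-wise division.

```python
def minmax_scale_columns(rows):
    """Scale each column to [0, 1] using that column's own min and max."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column has zero range; map it to 0.0 to avoid dividing by zero.
            (v - lo) / (hi - lo) if hi != lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Feature matrix t from the output above: one row per image, one column per GLCM statistic.
t = [
    [0.0510, 0.0503, 0.9513, 0.9519, 0.8988, 0.8988, 0.9930, 0.9935],
    [0.0345, 0.0339, 0.8223, 0.8255, 0.9616, 0.9617, 0.9957, 0.9957],
    [0.0230, 0.0246, 0.7450, 0.7270, 0.9815, 0.9813, 0.9971, 0.9970],
]

for row in minmax_scale_columns(t):
    print(" ".join("%.4f" % v for v in row))
```

The minima and maxima computed on the training images should be stored and reused to scale any test image, so that training and test features share the same scale.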

    Read the article
