Search Results

Search found 2 results on 1 pages for 'haxney'.

Page 1/1 | 1 

  • Can I use ido-completing-read instead of completing-read everywhere?

    - by haxney
    I'm a big fan of ido-mode, so much so that I would like to use it for things like describe-function or find-tag and so on, without having to write something like in "Can I get ido-mode-style completion for searching tags in Emacs?" for each one. Both (defalias completing-read ido-completing-read) and (setf 'completing-read 'ido-completing-read) don't work, at least partly because ido-completing-read calls completing-read in its body, so any simple redefinition would result in infinite recursion. In theory, it should be possible, since the first line of the docstring for ido-completing-read is "Ido replacement for the built-in completing-read." I've looked around a bit and can't seem to find anyone else who has attempted or succeeded at it. I realize that Icicles [2] probably provides something like this, and I may end up going with that anyway, but it is a bit more of a plunge than I care to take right now. Thanks for any help.

    Read the article

  • Can MySQL reasonably perform queries on billions of rows?

    - by haxney
    I am planning on storing scans from a mass spectrometer in a MySQL database and would like to know whether storing and analyzing this amount of data is remotely feasible. I know performance varies wildly depending on the environment, but I'm looking for the rough order of magnitude: will queries take 5 days or 5 milliseconds? Input format Each input file contains a single run of the spectrometer; each run is comprised of a set of scans, and each scan has an ordered array of datapoints. There is a bit of metadata, but the majority of the file is comprised of arrays 32- or 64-bit ints or floats. Host system |----------------+-------------------------------| | OS | Windows 2008 64-bit | | MySQL version | 5.5.24 (x86_64) | | CPU | 2x Xeon E5420 (8 cores total) | | RAM | 8GB | | SSD filesystem | 500 GiB | | HDD RAID | 12 TiB | |----------------+-------------------------------| There are some other services running on the server using negligible processor time. File statistics |------------------+--------------| | number of files | ~16,000 | | total size | 1.3 TiB | | min size | 0 bytes | | max size | 12 GiB | | mean | 800 MiB | | median | 500 MiB | | total datapoints | ~200 billion | |------------------+--------------| The total number of datapoints is a very rough estimate. Proposed schema I'm planning on doing things "right" (i.e. normalizing the data like crazy) and so would have a runs table, a spectra table with a foreign key to runs, and a datapoints table with a foreign key to spectra. The 200 Billion datapoint question I am going to be analyzing across multiple spectra and possibly even multiple runs, resulting in queries which could touch millions of rows. Assuming I index everything properly (which is a topic for another question) and am not trying to shuffle hundreds of MiB across the network, is it remotely plausible for MySQL to handle this? UPDATE: additional info The scan data will be coming from files in the XML-based mzML format. The meat of this format is in the <binaryDataArrayList> elements where the data is stored. Each scan produces = 2 <binaryDataArray> elements which, taken together, form a 2-dimensional (or more) array of the form [[123.456, 234.567, ...], ...]. These data are write-once, so update performance and transaction safety are not concerns. My naïve plan for a database schema is: runs table | column name | type | |-------------+-------------| | id | PRIMARY KEY | | start_time | TIMESTAMP | | name | VARCHAR | |-------------+-------------| spectra table | column name | type | |----------------+-------------| | id | PRIMARY KEY | | name | VARCHAR | | index | INT | | spectrum_type | INT | | representation | INT | | run_id | FOREIGN KEY | |----------------+-------------| datapoints table | column name | type | |-------------+-------------| | id | PRIMARY KEY | | spectrum_id | FOREIGN KEY | | mz | DOUBLE | | num_counts | DOUBLE | | index | INT | |-------------+-------------| Is this reasonable?

    Read the article

1