Search Results

Search found 666 results on 27 pages for 'mongodb'.

Page 5/27 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • What is the recommended MongoDB schema for this quiz-engine scenario?

    - by hughesdan
    I'm working on a quiz engine for learning a foreign language. The engine shows users four images simultaneously and then plays an audio file. The user has to match the audio to the correct image. Below is my MongoDB document structure. Each document consists of an image file reference and an array of references to audio files that match that image. To generate a quiz instance I select four documents at random, show the images and then play one audio file from the four documents at random. The next step in my application development is to decide on the best document schema for storing user guesses. There are several requirements to consider: I need to be able to report statistics at a user level. For example, total correct answers, total guesses, mean accuracy, etc) I need to be able to query images based on the user's learning progress. For example, select 4 documents where guess count is 10 and accuracy is <=0.50. The schema needs to be optimized for fast quiz generation. The schema must not cause future scaling issues vis a vis document size. Assume 1mm users who make an average of 1000 guesses. Given all of this as background information, what would be the recommended schema? For example, would you store each guess in the Image document or perhaps in a User document (not shown) or a new document collection created for logging guesses? Would you recommend logging the raw guess data or would you pre-compute statistics by incrementing counters within the relevant document? Schema for Image Collection: _id "505bcc7a45c978be24000005" date 2012-09-21 02:10:02 UTC imageFileName "BD3E134A-C7B3-4405-9004-ED573DF477FE-29879-0000395CF1091601" random 0.26997075392864645 user "2A8761E4-C13A-470E-A759-91432D61B6AF-25982-0000352D853511AF" audioFiles [ 0 { audioFileName "C3669719-9F0A-4EB5-A791-2C00486665ED-30305-000039A3FDA7DCD2" user "2A8761E4-C13A-470E-A759-91432D61B6AF-25982-0000352D853511AF" audioLanguage "English" date 2012-09-22 01:15:04 UTC } 1 { audioFileName "C3669719-9F0A-4EB5-A791-2C00486665ED-30305-000039A3FDA7DCD2" user "2A8761E4-C13A-470E-A759-91432D61B6AF-25982-0000352D853511AF" audioLanguage "Spanish" date 2012-09-22 01:17:04 UTC } ]

    Read the article

  • Is MongoDB a good choice or not for my application?

    - by shubham
    I have a Reporting application which stores the reports in xml format as recieved from source (XML schema is not defined, it can be any format) and those reports contain some keys and values. Like jobid, setid be keys for 1 type of report and userid, groupId for another type of report etc. The type of keys that can be referred from the document is determined by the namespaces used in the xml doc. These keys are stored on the basis of namespace used in the xml document. For e.g. If a tag in xml fragment uses namespace= "myspace1", then I have keys A and B for myspace1 stored in another table. It will fetch those keys from that table for this namespace, look for their values in xml doc and store it in another table along with the pointer to this xml document (Id of a record storing complete xml document in a cell). Use cases: When the user comes and queries for that key and value, I return the document or a set of documents that are having those key/value pairs. When the user comes and queries for a certain key and provide a name for xslt (pre stored), I fetch the set of documents fulfilling that criteria and convert that xml to html with the specified xslt. When the user comes and asks for a particular fragment of a doc then it can fetch a subset from a particular document also. When the user comes and queries for top x values of a certain key, I return the set of documents that are having top 10 values of that key. I am using DB2 database for its support of xml along with relational capabilities. That makes easier for me to run xpath expressions and fetch values of keys and also aggregate a set of documents fullfilling a criteria, all on the database side. Problems: DB2 stores XML doc of upto 2GB in size. Retrieval is very slow. If some thing involves many documents, then it takes significant time for things to show up in browser, and the user has to wait. Can MongoDb help in this case, as it is document oriented? can I do xml related xpath queries and document transformations on db side? Or is it ok to use both in such a case?

    Read the article

  • TOTD #166: Using NoSQL database in your Java EE 6 Applications on GlassFish - MongoDB for now!

    - by arungupta
    The Java EE 6 platform includes Java Persistence API to work with RDBMS. The JPA specification defines a comprehensive API that includes, but not restricted to, how a database table can be mapped to a POJO and vice versa, provides mechanisms how a PersistenceContext can be injected in a @Stateless bean and then be used for performing different operations on the database table and write typesafe queries. There are several well known advantages of RDBMS but the NoSQL movement has gained traction over past couple of years. The NoSQL databases are not intended to be a replacement for the mainstream RDBMS. As Philosophy of NoSQL explains, NoSQL database was designed for casual use where all the features typically provided by an RDBMS are not required. The name "NoSQL" is more of a category of databases that is more known for what it is not rather than what it is. The basic principles of NoSQL database are: No need to have a pre-defined schema and that makes them a schema-less database. Addition of new properties to existing objects is easy and does not require ALTER TABLE. The unstructured data gives flexibility to change the format of data any time without downtime or reduced service levels. Also there are no joins happening on the server because there is no structure and thus no relation between them. Scalability and performance is more important than the entire set of functionality typically provided by an RDBMS. This set of databases provide eventual consistency and/or transactions restricted to single items but more focus on CRUD. Not be restricted to SQL to access the information stored in the backing database. Designed to scale-out (horizontal) instead of scale-up (vertical). This is important knowing that databases, and everything else as well, is moving into the cloud. RBDMS can scale-out using sharding but requires complex management and not for the faint of heart. Unlike RBDMS which require a separate caching tier, most of the NoSQL databases comes with integrated caching. Designed for less management and simpler data models lead to lower administration as well. There are primarily three types of NoSQL databases: Key-Value stores (e.g. Cassandra and Riak) Document databases (MongoDB or CouchDB) Graph databases (Neo4J) You may think NoSQL is panacea but as I mentioned above they are not meant to replace the mainstream databases and here is why: RDBMS have been around for many years, very stable, and functionally rich. This is something CIOs and CTOs can bet their money on without much worry. There is a reason 98% of Fortune 100 companies run Oracle :-) NoSQL is cutting edge, brings excitement to developers, but enterprises are cautious about them. Commercial databases like Oracle are well supported by the backing enterprises in terms of providing support resources on a global scale. There is a full ecosystem built around these commercial databases providing training, performance tuning, architecture guidance, and everything else. NoSQL is fairly new and typically backed by a single company not able to meet the scale of these big enterprises. NoSQL databases are good for CRUDing operations but business intelligence is extremely important for enterprises to stay competitive. RDBMS provide extensive tooling to generate this data but that was not the original intention of NoSQL databases and is lacking in that area. Generating any meaningful information other than CRUDing require extensive programming. Not suited for complex transactions such as banking systems or other highly transactional applications requiring 2-phase commit. SQL cannot be used with NoSQL databases and writing simple queries can be involving. Enough talking, lets take a look at some code. This blog has published multiple blogs on how to access a RDBMS using JPA in a Java EE 6 application. This Tip Of The Day (TOTD) will show you can use MongoDB (a document-oriented database) with a typical 3-tier Java EE 6 application. Lets get started! The complete source code of this project can be downloaded here. Download MongoDB for your platform from here (1.8.2 as of this writing) and start the server as: arun@ArunUbuntu:~/tools/mongodb-linux-x86_64-1.8.2/bin$./mongod./mongod --help for help and startup optionsSun Jun 26 20:41:11 [initandlisten] MongoDB starting : pid=11210port=27017 dbpath=/data/db/ 64-bit Sun Jun 26 20:41:11 [initandlisten] db version v1.8.2, pdfile version4.5Sun Jun 26 20:41:11 [initandlisten] git version:433bbaa14aaba6860da15bd4de8edf600f56501bSun Jun 26 20:41:11 [initandlisten] build sys info: Linuxbs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 2017:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41Sun Jun 26 20:41:11 [initandlisten] waiting for connections on port 27017Sun Jun 26 20:41:11 [websvr] web admin interface listening on port 28017 The default directory for the database is /data/db and needs to be created as: sudo mkdir -p /data/db/sudo chown `id -u` /data/db You can specify a different directory using "--dbpath" option. Refer to Quickstart for your specific platform. Using NetBeans, create a Java EE 6 project and make sure to enable CDI and add JavaServer Faces framework. Download MongoDB Java Driver (2.6.3 of this writing) and add it to the project library by selecting "Properties", "LIbraries", "Add Library...", creating a new library by specifying the location of the JAR file, and adding the library to the created project. Edit the generated "index.xhtml" such that it looks like: <h1>Add a new movie</h1><h:form> Name: <h:inputText value="#{movie.name}" size="20"/><br/> Year: <h:inputText value="#{movie.year}" size="6"/><br/> Language: <h:inputText value="#{movie.language}" size="20"/><br/> <h:commandButton actionListener="#{movieSessionBean.createMovie}" action="show" title="Add" value="submit"/></h:form> This page has a simple HTML form with three text boxes and a submit button. The text boxes take name, year, and language of a movie and the submit button invokes the "createMovie" method of "movieSessionBean" and then render "show.xhtml". Create "show.xhtml" ("New" -> "Other..." -> "Other" -> "XHTML File") such that it looks like: <head> <title><h1>List of movies</h1></title> </head> <body> <h:form> <h:dataTable value="#{movieSessionBean.movies}" var="m" > <h:column><f:facet name="header">Name</f:facet>#{m.name}</h:column> <h:column><f:facet name="header">Year</f:facet>#{m.year}</h:column> <h:column><f:facet name="header">Language</f:facet>#{m.language}</h:column> </h:dataTable> </h:form> This page shows the name, year, and language of all movies stored in the database so far. The list of movies is returned by "movieSessionBean.movies" property. Now create the "Movie" class such that it looks like: import com.mongodb.BasicDBObject;import com.mongodb.BasicDBObject;import com.mongodb.DBObject;import javax.enterprise.inject.Model;import javax.validation.constraints.Size;/** * @author arun */@Modelpublic class Movie { @Size(min=1, max=20) private String name; @Size(min=1, max=20) private String language; private int year; // getters and setters for "name", "year", "language" public BasicDBObject toDBObject() { BasicDBObject doc = new BasicDBObject(); doc.put("name", name); doc.put("year", year); doc.put("language", language); return doc; } public static Movie fromDBObject(DBObject doc) { Movie m = new Movie(); m.name = (String)doc.get("name"); m.year = (int)doc.get("year"); m.language = (String)doc.get("language"); return m; } @Override public String toString() { return name + ", " + year + ", " + language; }} Other than the usual boilerplate code, the key methods here are "toDBObject" and "fromDBObject". These methods provide a conversion from "Movie" -> "DBObject" and vice versa. The "DBObject" is a MongoDB class that comes as part of the mongo-2.6.3.jar file and which we added to our project earlier.  The complete javadoc for 2.6.3 can be seen here. Notice, this class also uses Bean Validation constraints and will be honored by the JSF layer. Finally, create "MovieSessionBean" stateless EJB with all the business logic such that it looks like: package org.glassfish.samples;import com.mongodb.BasicDBObject;import com.mongodb.DB;import com.mongodb.DBCollection;import com.mongodb.DBCursor;import com.mongodb.DBObject;import com.mongodb.Mongo;import java.net.UnknownHostException;import java.util.ArrayList;import java.util.List;import javax.annotation.PostConstruct;import javax.ejb.Stateless;import javax.inject.Inject;import javax.inject.Named;/** * @author arun */@Stateless@Namedpublic class MovieSessionBean { @Inject Movie movie; DBCollection movieColl; @PostConstruct private void initDB() throws UnknownHostException { Mongo m = new Mongo(); DB db = m.getDB("movieDB"); movieColl = db.getCollection("movies"); if (movieColl == null) { movieColl = db.createCollection("movies", null); } } public void createMovie() { BasicDBObject doc = movie.toDBObject(); movieColl.insert(doc); } public List<Movie> getMovies() { List<Movie> movies = new ArrayList(); DBCursor cur = movieColl.find(); System.out.println("getMovies: Found " + cur.size() + " movie(s)"); for (DBObject dbo : cur.toArray()) { movies.add(Movie.fromDBObject(dbo)); } return movies; }} The database is initialized in @PostConstruct. Instead of a working with a database table, NoSQL databases work with a schema-less document. The "Movie" class is the document in our case and stored in the collection "movies". The collection allows us to perform query functions on all movies. The "getMovies" method invokes "find" method on the collection which is equivalent to the SQL query "select * from movies" and then returns a List<Movie>. Also notice that there is no "persistence.xml" in the project. Right-click and run the project to see the output as: Enter some values in the text box and click on enter to see the result as: If you reached here then you've successfully used MongoDB in your Java EE 6 application, congratulations! Some food for thought and further play ... SQL to MongoDB mapping shows mapping between traditional SQL -> Mongo query language. Tutorial shows fun things you can do with MongoDB. Try the interactive online shell  The cookbook provides common ways of using MongoDB In terms of this project, here are some tasks that can be tried: Encapsulate database management in a JPA persistence provider. Is it even worth it because the capabilities are going to be very different ? MongoDB uses "BSonObject" class for JSON representation, add @XmlRootElement on a POJO and how a compatible JSON representation can be generated. This will make the fromXXX and toXXX methods redundant.

    Read the article

  • NoSQL : MongoDB 2.2 est disponible, framework d'agrégation, meilleure gestion des clusters distribués et de l'accès concurrentiel

    NoSQL : MongoDB 2.2 est disponible Framework d'agrégation, meilleure gestion des clusters distribués et amélioration de l'accès concurrentiel 10Gen, l'entreprise derrière le SGBD populaire MongoDB, vient d'annoncer la dernière version de l'outil, à savoir la 2.2. Elle comporte des fonctionnalités anticipées dans la Developer Preview de la 2.1, mais adaptées à un environnement de production. [IMG]http://idelways.developpez.com/news/images/mongodb.png[/IMG]L'apport le plus important de cette version est un nouveau framework d'agrégation en temps réel idéal pour les opérations complexes d'analyse. Celui-ci permettra de récupérer et de manipuler les données dans MongoDB, san...

    Read the article

  • Wrapping my head around MongoDB, mongomapper and joins...

    - by cbmeeks
    I'm new to MongoDB and I've used RDBMS for years. Anyway, let's say I have the following collections: Realtors many :bookmarks key :name Houses key :address, String key :bathrooms, Integer Properties key :address, String key :landtype, String Bookmark key :notes I want a Realtor to be able to bookmark a House and/or a Property. Notice that Houses and Properties are stand-alone and have no idea about Realtors or Bookmarks. I want the Bookmark to be sort of like a "join table" in MySQL. The Houses/Properties come from a different source so they can't be modified. I would like to be able to do this in Rails: r = Realtor.first r.bookmarks would give me: House1 House2 PropertyABC PropertyOO1 etc... There will be thousands of Houses and Properties. I realize that this is what RDBMS were made for. But there are several reasons why I am using MongoDB so I would like to make this work. Any suggestions on how to do something like this would be appreciated. Thanks!

    Read the article

  • Using MongoDB's map/reduce to "group by" two fields

    - by ibz
    I need something slightly more complex than the examples in the MongoDB docs and I can't seem to be able to wrap my head around it. Say I have a collection of objects of the form {date: "2010-10-10", type: "EVENT_TYPE_1", user_id: 123, ...} Now I want to get something similar to a SQL GROUP BY query, grouping over both date and type. That is, I want the number of events of each type in each day. Also, I'd like to make it unique by user_id, ie. if a user has more events in the same day, count it only once. I'm trying to do this with map/reduce. I do db.logs.mapReduce(function() { emit(this.type, 1); }, function(k, vals) { var total = 0; for (var i = 0; i < vals.length; i++) total += vals[i]; return total; }}) which nicely groups by type, but now, how can I group by date at the same time? Seems the key in emit() can't be an array (I thought about doing emit([this.date, this.type], 1)). Also, how can I ensure the per-user uniqueness? I'm just starting with MongoDB and I'm still having trouble grasping the basic concepts. Also, there is not much documentation available out there. Any help from more experienced users is appreciated. Thanks!

    Read the article

  • Mongodb - how to deserialze when a property has an Interface return type

    - by Mark Kelly
    I'm attempting to avoid introducing any dependencies between my Data layer and client code that makes use of this layer, but am running into some problems when attempting to do this with Mongo (using the MongoRepository) MongoRepository shows examples where you create Types that reflect your data structure, and inherit Entity where required. Eg. [CollectionName("track")] public class Track : Entity { public string name { get; set; } public string hash { get; set; } public Artist artist { get; set; } public List<Publish> published {get; set;} public List<Occurence> occurence {get; set;} } In order to make use of these in my client code, I'd like to replace the Mongo-specific types with Interfaces, e.g: [CollectionName("track")] public class Track : Entity, ITrackEntity { public string name { get; set; } public string hash { get; set; } public IArtistEntity artist { get; set; } public List<IPublishEntity> published {get; set;} public List<IOccurenceEntity> occurence {get; set;} } However, the Mongo driver doesn't know how to treat these interfaces, and I understandably get the following error: An error occurred while deserializing the artist property of class sf.data.mongodb.entities.Track: No serializer found for type sf.data.IArtistEntity. --- MongoDB.Bson.BsonSerializationException: No serializer found for type sf.data.IArtistEntity. Does anyone have any suggestions about how I should approach this?

    Read the article

  • Looking for: nosql (redis/mongodb) based event logging for Django

    - by Parand
    I'm looking for a flexible event logging platform to store both pre-defined (username, ip address) and non-pre-defined (can be generated as needed by any piece of code) events for Django. I'm currently doing some of this with log files, but it ends up requiring various analysis scripts and ends up in a DB anyway, so I'm considering throwing it immediately into a nosql store such as MongoDB or Redis. The idea is to be easily able to query, for example, which ip address the user most commonly comes from, whether the user has ever performed some action, lookup the outcome for a specific event, etc. Is there something that already does this? If not, I'm thinking of this: The "event" is a dictionary attached to the request object. Middleware fills in various pieces (username, ip, sql timing), code fills in the rest as needed. After the request is served a post-request hook drops the event into mongodb/redis, normalizing various fields (eg. incrementing the username:ip address counter) and dropping the rest in as is. Words of wisdom / pointers to code that does some/all of this would be appreciated.

    Read the article

  • Mongodb vs. Cassandra

    - by ming yeow
    I am evaluating what might be the best migration option. Currently, i am on a sharded mysql (horizontal partition), with most of my data stored in json blobs. I do not have any complex SQL queries( already migrated away after since I partitioned my db) Right now, it seems like both Mongodb and Cassandra would be likely options. My situation lots of reads in every query, less regular writes not worried about "massive" scalability more concerned about simple setup, maintenance and code minimize hardware/server cost

    Read the article

  • MongoDB transactions?

    - by Arnis L.
    Playing around with MongoDB and NoRM in .NET. Thing that confused me - there are no transactions (can't just tell MongoConnection.Begin/EndTransaction or something like that). I want to use Unit of work pattern and rollback changes in case something fails. Is there still a clean way how to enrich my repository with ITransaction?

    Read the article

  • Mongodb Query To select records having a given key

    - by sagar
    let the records in database are {"_id":"1","fn":"sagar","ln":"Varpe"} {"_id":"1","fn":"sag","score":"10"} {"_id":"1","ln":"ln1","score":"10"} {"_id":"1","ln":"ln2"} I need to design a MongoDB query to find all records who has a given key like if i pass "ln" as a parameter to query it shold return all records in which "ln"is a Key , the results fo are {"_id":"1","fn":"sagar","ln":"Varpe"} {"_id":"1","ln":"ln1","score":"10"} {"_id":"1","ln":"ln2"}

    Read the article

  • MongoDB index/RAM relationship

    - by Tegan Clark
    I'm about to adopt MongoDB for a new project and I've chosen it for flexibility, not scalability. From the documentation and web posts I keep reading that all indexes are in RAM. This just isn't making sense to me as my indexes will easily be larger than the amount of available RAM. Can anyone share some insight on the index/RAM relationship and what happens when both an individual index and all of my indexes exceed the size of available RAM?

    Read the article

  • Calculated group-by fields in MongoDB

    - by Navin Viswanath
    For this example from the MongoDB documentation, how do I write the query using MongoTemplate? db.sales.aggregate( [ { $group : { _id : { month: { $month: "$date" }, day: { $dayOfMonth: "$date" }, year: { $year: "$date" } }, totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } }, averageQuantity: { $avg: "$quantity" }, count: { $sum: 1 } } } ] ) Or in general, how do I group by a calculated field?

    Read the article

  • Random record from MongoDB

    - by Will M
    I am looking to get a random record from a huge (100 million record) mongodb. What is the fastest and most efficient way to do so? The data is already there and there are no field in which I can generate a random number and obtain a random row. Any suggestions?

    Read the article

  • MongoDB query against geospatial index with maxDistance fails from node.js client

    - by user1735497
    I want to query against a geospatial index in mongo-db (designed after this tutorial http://www.mongodb.org/display/DOCS/Geospatial+Indexing). So when I execute this from the shell everything works fine: db.sellingpoints.find(( { location : { $near: [48.190120, 16.270895], $maxDistance: 7 / 111.2 } } ); but the same query from my nodejs application (using mongoskin or mongoose), won't return any results until i set the distance-value to a very high number (5690) db.collection('sellingpoints') .find({ location: { $near: [lat,lng], $maxDistance: distance / 111.2} }) .limit(limit) .toArray(callback); Has someone any idea how to fix that?

    Read the article

  • MongoDB equivalent of SQL "OR"

    - by Matt
    So, MongoDB defaults to "AND" when finding records. For example: db.users.find({age: {'$gte': 30}, {'$lte': 40}}); The above query finds users = 30 AND <= 40 years old. How would I find users <= 30 OR = 40 years old?

    Read the article

  • MongoDB PHP Drive install failed (Windows)

    - by terence410
    I attempted to download the php_mongo.dll from this link (http://github.com/mongodb/mongo-php-driver/downloads) and added the extension to php. However it doesn't work at all. My PHP version is: 5.2.x Windows: Windows 7 64 bit. I have also try to use pecl install mongo but it returned some error like: This DSP mongo.dsp doesn't not exist. Could somebody help?

    Read the article

  • mongodb java driver - raw command?

    - by Scott
    Is it possible to execute raw commands as javascript through the Java driver for MongoDB? I'm tired of wrapping everything in Java objects using Rhino, and would happily sacrifice performance for the convenience of passing javascript directly through to the DB. If not, I can always use sleepymongoose or something, but I don't really want to add yet another language (python) to the stack at this point. Any insights are appreciated.

    Read the article

  • MongoDB using NOT and AND together

    - by Stankalank
    I'm trying to negate an $and clause with MongoDB and I'm getting a MongoError: invalid operator: $and message back. Basically what I want to achieve is the following: query = { $not: { $and: [{institution_type:'A'}, {type:'C'}] } } Is this possible to express in a mongo query? Here is a sample collection: { "institution_type" : "A", "type" : "C" } { "institution_type" : "A", "type" : "D" } { "institution_type" : "B", "type" : "C" } { "institution_type" : "B", "type" : "D" } What I want to get back is the following: { "institution_type" : "A", "type" : "D" } { "institution_type" : "B", "type" : "C" } { "institution_type" : "B", "type" : "D" }

    Read the article

  • Indexing on only part of a field in MongoDB

    - by Rob Hoare
    Is there a way to create an index on only part of a field in MongoDB, for example on the first 10 characters? I couldn't find it documented (or asked about on here). The MySQL equivalent would be CREATE INDEX part_of_name ON customer (name(10));. Reason: I have a collection with a single field that varies in length from a few characters up to over 1000 characters, average 50 characters. As there are a hundred million or so documents it's going to be hard to fit the full index in memory (testing with 8% of the data the index is already 400MB, according to stats). Indexing just the first part of the field would reduce the index size by about 75%. In most cases the search term is quite short, it's not a full-text search. A work-around would be to add a second field of 10 (lowercased) characters for each item, index that, then add logic to filter the results if the search term is over ten characters (and that extra field is probably needed anyway for case-insensitive searches, unless anybody has a better way). Seems like an ugly way to do it though. [added later] I tried adding the second field, containing the first 12 characters from the main field, lowercased. It wasn't a big success. Previously, the average object size was 50 bytes, but I forgot that includes the _id and other overheads, so my main field length (there was only one) averaged nearer to 30 bytes than 50. Then, the second field index contains the _id and other overheads. Net result (for my 8% sample) is the index on the main field is 415MB and on the 12 byte field is 330MB - only a 20% saving in space, not worthwhile. I could duplicate the entire field (to work around the case insensitive search problem) but realistically it looks like I should reconsider whether MongoDB is the right tool for the job (or just buy more memory and use twice as much disk space). [added even later] This is a typical document, with the source field, and the short lowercased field: { "_id" : ObjectId("505d0e89f56588f20f000041"), "q" : "Continental Airlines", "f" : "continental " } Indexes: db.test.ensureIndex({q:1}); db.test.ensureIndex({f:1}); The 'f" index, working on a shorter field, is 80% of the size of the "q" index. I didn't mean to imply I included the _id in the index, just that it needs to use that somewhere to show where the index will point to, so it's an overhead that probably helps explain why a shorter key makes so little difference. Access to the index will be essentially random, no part of it is more likely to be accessed than any other. Total index size for the full file will likely be 5GB, so it's not extreme for that one index. Adding some other fields for other search cases, and their associated indexes, and copies of data for lower case, does start to add up, which I why I started looking into a more concise index.

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >