Search Results

Search found 371 results on 15 pages for 'cassandra clark'.


  • Cassandra hot keyspace structure change

    - by Pierre
    Hello. I'm currently running a 12-node Cassandra cluster storing 4TB of data, with a replication factor of 3. For an application update, we need to change the configuration of our keyspace, and we'd like to avoid any downtime if possible. I read on a mailing list that the best way to do it is to:

      1. Kill the Cassandra process on one server of the cluster.
      2. Start it again, wait for the commit log to be written to disk, and kill it again.
      3. Make the modifications in the storage.xml file.
      4. Rename or delete the files in the data directories according to the changes we made.
      5. Start Cassandra.
      6. Go to step 1 with the next server on the list.

    My questions are:

      - Did I understand the process correctly?
      - Is there any risk of data corruption?
      - During the process, there will be servers with different versions of the storage.xml file in the same cluster, same keyspace. Is that a problem?
      - Same question as above if we not only add, rename and remove ColumnFamilies, but also change the CompareWith parameter or transform an existing column family into a super one. Or do we need to change the name?

    Thank you for your answers. It's the first time I'll do this, and I'm a little bit scared.

    Read the article

  • Cassandra instead of MySQL for social networking app

    - by Christopher McCann
    I am in the middle of building a new app which will have very similar features to Facebook. Although it obviously won't ever have to deal with the likes of 400 million users, it will still be used by a substantial user base, and most of them will demand that it run very quickly. I have extensive experience with MySQL, but a social app brings complexities that MySQL is not well suited to. I know Facebook, Twitter etc. have moved towards Cassandra for a lot of their data, but I am not sure how far to go with it. For example, would you store things such as user data (username, passwords, addresses etc.) in Cassandra? Would you store e-mails, comments, status updates etc. in Cassandra? I have also read a lot that something like neo4j is much better for representing the friend relationships used by social apps, as it is a graph database. I am only just starting down the NoSQL route, so any guidance is greatly appreciated. Would anyone be able to advise me on this? I hope I am not being too general!

    Read the article

  • Migrating C# code from Cassandra .5 to .6

    - by Jonathan
    I have some simple code derived from an example that is meant to do a quick write to the Cassandra db, then loop back and read all current entries. Everything worked fine. When .6 came out, I upgraded Cassandra and Thrift, which threw errors in my code (www[dot]copypastecode[dot]com/26760/). I was able to iron out the errors by converting the necessary types; however, the version that compiles now only seems to read one item back. I'm not sure if it's not saving db changes or if it's only reading back 1 entry. The "fixed" code is here: http://www.copypastecode.com/26752/. Any help would be greatly appreciated.

    Read the article

  • Using Thrift to connect to Cassandra from .NET

    - by vtortola
    Hi, I'm interested in Cassandra and I'd like to test it at home on my Windows XP computer. I've found instructions to install and run Cassandra on Windows, and it's already up and running. I've also found the thrift executable for Windows and generated the C# interfaces, but when I try to compile that generated code in Visual Studio I get: "The type or namespace name 'Thrift' could not be found (are you missing a using directive or an assembly reference?)". So I'm missing something else, but I cannot find what. What is it? Is it a dll? I've looked in the thrift code and I cannot find anything related to .NET, so what am I missing? Thanks in advance. Regards.

    Read the article

  • How do you store sets in Cassandra?

    - by Ben W
    I'd like to convert this JSON to a data model in Cassandra, where each of the arrays is a set with no duplicates: var data = { "data1": { "100": [1, 2, 3], "200": [3, 4] }, "data2": { "k1": [1], "k2": [4, 5] } } I'd like to query like this: data["data1"]["100"] to retrieve the sets. Anyone know how you might model this in Cassandra? (The only thing I came up with was columns whose name was a set value and whose value was an empty string, but that felt wrong.) It's not OK to serialize the sets as JSON or some other string, which would make this much easier. Also, I should note that it's OK to split data1 and data2 into separate ColumnFamilies; it's not necessary that they're keys in the same one.
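
    One way to picture the column-name-as-member idea from the question is with nested dictionaries standing in for a column family. This is only an in-memory sketch in Python, not a call into any Cassandra client; the helper names `add_member` and `get_set` are hypothetical.

```python
# In-memory stand-in for a column family: row key -> {column name -> value}.
# Each set member is stored as a column *name*; the column value stays empty.
from collections import defaultdict

data1 = defaultdict(dict)  # hypothetical column family "data1"


def add_member(cf, row_key, member):
    """Insert a set member; writing the same column again changes nothing."""
    cf[row_key][member] = b""


def get_set(cf, row_key):
    """Read the row back as a sorted list of its column names."""
    return sorted(cf[row_key])


for member in (1, 2, 3, 3, 2):  # duplicates collapse automatically
    add_member(data1, "100", member)
for member in (3, 4):
    add_member(data1, "200", member)

print(get_set(data1, "100"))
print(get_set(data1, "200"))
```

    The design point is that inserting a member twice overwrites the same column, so uniqueness comes for free and no read-before-write is needed.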

    Read the article

  • Cassandra random read speed

    - by Jody Powlette
    We're still evaluating Cassandra for our data store. As a very simple test, I inserted a value for 4 columns into the Keyspace1/Standard1 column family on my local machine, amounting to about 100 bytes of data. Then I read it back as fast as I could by row key. I can read it back at 160,000/second. Great.

    Then I put in a million similar records, all with keys in the form X.Y where X in (1..10) and Y in (1..100,000), and I queried for a random record. Performance fell to 26,000 queries per second. This is still well above the number of queries we need to support (about 1,500/sec).

    Finally I put ten million records in, from 1.1 up through 10.1000000, and randomly queried for one of the 10 million records. Performance is abysmal at 60 queries per second and my disk is thrashing around like crazy. I also verified that if I ask for a subset of the data, say the 1,000 records between 3,000,000 and 3,001,000, it returns slowly at first and then, as they cache, it speeds right up to 20,000 queries per second and my disk stops going crazy.

    I've read all over that people are storing billions of records in Cassandra and fetching them at 5-6k per second, but I can't get anywhere near that with only 10 million records. Any idea what I'm doing wrong? Is there some setting I need to change from the defaults? I'm on an overclocked Core i7 box with 6 GB of RAM, so I don't think it's the machine.

    Here's my code to fetch records, which I'm spawning into 8 threads to ask for one value from one column via row key:

      ColumnPath cp = new ColumnPath();
      cp.Column_family = "Standard1";
      cp.Column = utf8Encoding.GetBytes("site");
      string key = (1+sRand.Next(9)) + "." + (1+sRand.Next(1000000));
      ColumnOrSuperColumn logline = client.get("Keyspace1", key, cp, ConsistencyLevel.ONE);

    Thanks for any insights
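
    The key-generation line in the snippet is worth double-checking. In .NET, Random.Next(n) returns a value from 0 up to but not including n, so 1 + sRand.Next(9) yields 1 through 9 and never reaches the prefix 10. A minimal Python sketch of a generator that covers the whole key space described above (`random_key` is a hypothetical name, and the seed is there only to make runs repeatable):

```python
import random


def random_key(rng, prefixes=10, per_prefix=1_000_000):
    """Return a key of the form X.Y with X in 1..prefixes and Y in 1..per_prefix."""
    # randint is inclusive on both ends, unlike .NET's Random.Next(n).
    x = rng.randint(1, prefixes)
    y = rng.randint(1, per_prefix)
    return f"{x}.{y}"


rng = random.Random(42)  # seeded only so runs are repeatable
sample = [random_key(rng) for _ in range(5)]
print(sample)
```

    This does not explain the drop to 60 queries per second by itself; it only makes sure the benchmark samples the full range it claims to.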

    Read the article

  • Mongodb vs. Cassandra

    - by ming yeow
    I am evaluating what might be the best migration option. Currently, I am on a sharded MySQL setup (horizontal partitioning), with most of my data stored in JSON blobs. I do not have any complex SQL queries (I already migrated away from them when I partitioned my db). Right now, it seems like both MongoDB and Cassandra would be likely options. My situation:

      - lots of reads in every query, less regular writes
      - not worried about "massive" scalability
      - more concerned about simple setup, maintenance and code
      - minimize hardware/server cost

    Read the article

  • Cassandra: Using LongType

    - by TheDeveloper
    I'm trying to insert data into a ColumnFamily with "CompareWith" attribute "LongType". However, when trying to insert data under numerical keys, I get a thrift error. When attempting the same operation with the cassandra-cli program, I get the error "A long is exactly 8 bytes". How can I resolve this? Should I use a different comparison type? Thanks
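
    The message "A long is exactly 8 bytes" suggests the column name is reaching the server as something other than an 8-byte value, for example the decimal string "123". A minimal sketch, assuming LongType expects a big-endian signed 64-bit integer (the way Java serializes a long); `pack_long` and `unpack_long` are hypothetical helper names, not part of any client library:

```python
import struct


def pack_long(value):
    """Encode an integer as 8 bytes, big-endian, signed."""
    return struct.pack(">q", value)


def unpack_long(raw):
    """Decode 8 big-endian bytes back into an integer."""
    return struct.unpack(">q", raw)[0]


name = pack_long(1234567890)
print(len(name))          # 8
print(unpack_long(name))  # 1234567890
```

    The packed bytes would then be passed as the column name in place of the numeric string.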

    Read the article

  • NoHostAvailableException With Cassandra & DataStax Java Driver If Large ResultSet

    - by hughj
    The setup:

      - 2-node Cassandra 1.2.6 cluster
      - replicas=2
      - very large CQL3 table with no secondary index
      - Rowkey is a UUID.randomUUID().toString()
      - read consistency set to ONE
      - Using DataStax java driver 1.0

    The request: Attempting to do a table scan by "SELECT some-col from schema.table LIMIT nnn;"

    The fail: Once I go beyond a certain nnn LIMIT, I start to get NoHostAvailableExceptions from the driver. It reads like this:

      com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered))
        at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:64)
        at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:214)
        at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:169)
        at com.jpmc.es.rtm.storage.impl.EventExtract.main(EventExtract.java:36)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
      Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered))
        at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:98)
        at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:165)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)

    Given: This is probably not the most enlightened thing to do to a large table with millions of rows, but this is how I learn what not to do, so I would really appreciate someone who could volunteer how this kind of error can be debugged. For example, when this happens, there are no indications that the nodes in the cluster ever had an issue with the request (there is nothing in the logs on either node that indicates any timeout or failure). Also, I enabled the trace on the driver, which gives you some nice autotrace (à la Oracle) info as long as the query succeeds. But in this case, the driver throws a NoHostAvailableException and no ExecutionInfo is available, so tracing has not provided any benefit.

    I also find it interesting that this does not seem to be recorded as a timeout (my JMX consoles tell me no timeouts have occurred). So, I am left not understanding WHERE the failure is actually occurring. I am left with the idea that it is the driver that is having a problem, but I don't know how to debug it (and I would really like to). I have read several posts from folks who state that querying for result sets of 10000 rows is probably not a good idea, and I am willing to accept this, but I would like to understand what is causing the exception and where the exception is happening. FWIW, I also tried bumping the timeout properties in cassandra.yaml, but this made no difference whatsoever. I welcome any suggestions, anecdotes, insults, or monetary contributions for my registration in the house of moron-developers. Regards!!

    Read the article

  • Cassandra use PHP SimpleCassie get all keys

    - by chnet
    Is it possible to get all keys in a column family using SimpleCassie? I looked at SimpleCassie's Google Code page, but could not figure it out. Another issue is that I used the following code to access a column value: $price = $cassie->keyspace('ToyStore')->cf('Toys')->key('Transformer')->column('Price')->get(); echo $price; It always complains "object of cassandra columnorsupercolumn cannot be converted to string". Is it possible to print out the column value?

    Read the article

  • Retrieve every key of a column family in Cassandra

    - by Matroska
    Hi all, I have found no way to translate a simple select like SELECT * FROM USER into Cassandra. Is it possible to simply retrieve all the keys in a ColumnFamily? The only one I have found is a select with a key range (get_range_slices). Is there a way to not define the key range? Thanks Tobia Loschiavo
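
    A common pattern with get_range_slices is to page through the whole range: start from an empty start key, ask for a fixed number of rows, remember the last key returned, and use it as the start of the next request. The sketch below simulates that loop in plain Python. `fetch_range` is a hypothetical stand-in for the real client call, and it assumes the start key is inclusive, so the first row of each later page repeats the previous page's last row and is skipped. With a random partitioner the server returns rows in token order rather than key order, but the loop has the same shape.

```python
def fetch_range(rows, start_key, count):
    """Hypothetical stand-in for a range query: up to `count` keys >= start_key.

    `rows` is a sorted list of keys; an empty start_key means "from the beginning".
    """
    return [k for k in rows if k >= start_key][:count]


def all_keys(rows, page_size=3):
    """Yield every key by paging through the range (page_size must be >= 2)."""
    start = ""
    first_page = True
    while True:
        page = fetch_range(rows, start, page_size)
        # After the first page, the first key repeats the previous page's last key.
        fresh = page if first_page else page[1:]
        for key in fresh:
            yield key
        if len(page) < page_size:
            break  # a short page means the range is exhausted
        start = page[-1]
        first_page = False


stored = sorted(["alice", "bob", "carol", "dave", "erin", "frank", "grace"])
print(list(all_keys(stored)))
```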

    Read the article

  • Introduction to the NoSQL database Cassandra, by Khanh Tuong Maudoux and François Ostyn

    So@t, an IT engineering and consulting company, offers you an article on Cassandra. More precisely, it is a write-up of the talk by Nicolas Romanetti, co-founder of Jaxio, who presented the open source NoSQL database Cassandra, part of the Apache project, at Devoxx France 2012. http://soat.developpez.com/articles/cassandra/ You can use this post to share your comments. Mickael...

    Read the article

  • Cassandra/HBase or just MySQL: Potential problems doing the next thing

    - by alexeypro
    Say I have "user". It's the key. And I need to keep a "user count". I am planning to have a record with key "user" and a value from "0" to "9999+ ;-)" (as many as I'll have). What problems will I run into if I use Cassandra, HBase or MySQL for that? Say I have thousands of new updates to this "user" key, where I need to increment the value. Am I in trouble? Will it be locked for writes? Is there any other way of doing that? Why this is done: there will be a lot of "user"-like keys for various other cases, but the idea is the same. Why keep it this way: because I'll have more reads, so I can always get the "counted value" very fast.
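
    One technique sometimes used when many writers hit a single counter is to split it into several shards, increment one shard chosen at random, and sum the shards on read. This is a swapped-in idea rather than something the question proposes, and the sketch below is plain in-memory Python; the `ShardedCounter` class and its shard count are hypothetical.

```python
import random


class ShardedCounter:
    """Spread increments for one logical key over several shard slots."""

    def __init__(self, shards=8, rng=None):
        self.slots = [0] * shards
        self.rng = rng or random.Random()

    def increment(self, amount=1):
        # Each write touches one shard only, so concurrent writers rarely collide.
        index = self.rng.randrange(len(self.slots))
        self.slots[index] += amount

    def value(self):
        # A read sums all shards; with few shards this stays cheap.
        return sum(self.slots)


user_count = ShardedCounter(shards=4)
for _ in range(1000):
    user_count.increment()
print(user_count.value())  # 1000
```

    The trade-off is that reads touch every shard, which suits the read-heavy case only when the shard count is kept small.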

    Read the article

  • How to use Cassandra's Map Reduce with or w/o Pig?

    - by UltimateBrent
    Can someone explain how MapReduce works with Cassandra .6? I've read through the word count example, but I don't quite follow what's happening on the Cassandra end vs. the "client" end. https://svn.apache.org/repos/asf/cassandra/trunk/contrib/word_count/ For instance, let's say I'm using Python and Pycassa, how would I load in a new map reduce function, and then call it? Does my map reduce function have to be java that's installed on the cassandra server? If so, how do I call it from Pycassa? There's also mention of Pig making this all easier, but I'm a complete Hadoop noob, so that didn't really help. Your answer can use Thrift or whatever, I just mentioned Pycassa to denote the client side. I'm just trying to understand the difference between what runs in the Cassandra cluster vs. the actual server making the requests.

    Read the article

  • Suggest Cassandra data model for an existing schema

    - by Andriy Bohdan
    Hello guys! I hope someone can help me by suggesting a suitable data model to implement with the NoSQL database Apache Cassandra. Most of all, I need it to work under high load and with large amounts of data. Simplified, I have 3 types of objects: Product, Tag and ProductTag.

    Product:
      - key - string key
      - name - string
      - .... - some other fields

    Tag:
      - key - string key
      - name - unique tag words

    ProductTag:
      - product_key - foreign key referring to product
      - tag_key - foreign key referring to tag
      - rating - the rating of the tag for this product

    Each product may have 0 or many tags. A tag may be assigned to 1 or many products. This means the relation between products and tags is many-to-many in relational database terms. The value of "rating" is updated very often. I need to run the following queries:

      - Select objects by keys
      - Select tags for a product, ordered by rating
      - Select products by tag, ordered by rating
      - Update rating by product_key and tag_key

    The most important thing is to make these queries really fast on large amounts of data, considering that the rating is constantly updated.
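
    One possible shape, sketched with Python dictionaries standing in for column families: keep the many-to-many relation twice, once keyed by product and once keyed by tag, and write the rating to both sides on every update. All names here (`tags_by_product`, `products_by_tag`, `set_rating` and the sample keys) are hypothetical, and sorting by rating is done at read time in this sketch.

```python
from collections import defaultdict

products = {"p1": {"name": "Kettle"}, "p2": {"name": "Teapot"}}
tags = {"t1": {"name": "kitchen"}, "t2": {"name": "gift"}}

# Two denormalized views of ProductTag: row key -> {other key -> rating}.
tags_by_product = defaultdict(dict)
products_by_tag = defaultdict(dict)


def set_rating(product_key, tag_key, rating):
    """Update the rating in both views so either query is a single-row read."""
    tags_by_product[product_key][tag_key] = rating
    products_by_tag[tag_key][product_key] = rating


def tags_for_product(product_key):
    """Tag keys for a product, highest rating first."""
    row = tags_by_product[product_key]
    return sorted(row, key=row.get, reverse=True)


def products_for_tag(tag_key):
    """Product keys for a tag, highest rating first."""
    row = products_by_tag[tag_key]
    return sorted(row, key=row.get, reverse=True)


set_rating("p1", "t1", 5)
set_rating("p1", "t2", 9)
set_rating("p2", "t1", 7)
set_rating("p1", "t1", 8)  # rating changes overwrite in place

print(tags_for_product("p1"))
print(products_for_tag("t1"))
```

    Each rating update costs two writes, which is the price paid for making both ordered lookups a read of one row.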

    Read the article

  • Apache Cassandra overwhelming bandwidth overhead

    - by tanyehzheng
    While testing Apache Cassandra, I inserted 1000 rows of data and let them propagate to the other machine on the LAN. This is a 2-machine cluster. I monitored the network connection between the two machines. The total data I expected to flow between the two servers was around 25Mb (including all column names, column values and timestamps). But the actual data sent and received between them was a whopping 362Mb!! Does anybody know why there is such an overwhelming overhead? Thank you

    Read the article

  • Are batch mutations atomic in Cassandra?

    - by user317459
    The Cassandra API supports batch mutations:

      batch_mutate(keyspace, mutation_map, consistency_level): Executes the specified mutations on the keyspace. mutation_map is a map<string, map<string, vector<Mutation>>>; the outer map maps the key to the inner map, which maps the column family to the Mutation; can be read as: map<key : string, map<column_family : string, vector<Mutation>>>. To be more specific, the outer map key is a row key, the inner map key is the column family name. A Mutation specifies either columns to insert or columns to delete. See Mutation and Deletion above for more details.

    Are all mutations that are executed in a batch executed atomically? So if one of the mutations fails, do the others fail too?
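
    The nesting described above can be pictured as a dictionary of dictionaries. The sketch below builds one in plain Python; it assumes each column family maps to a list of mutations, and the `insert` and `delete` helpers and their dictionary layout are hypothetical placeholders rather than the Thrift-generated Mutation type.

```python
from collections import defaultdict


def insert(column, value):
    """Hypothetical stand-in for a Mutation that writes one column."""
    return {"op": "insert", "column": column, "value": value}


def delete(column):
    """Hypothetical stand-in for a Mutation that deletes one column."""
    return {"op": "delete", "column": column}


# Outer key: row key. Inner key: column family name. Value: list of mutations.
mutation_map = defaultdict(lambda: defaultdict(list))

mutation_map["user42"]["Users"].append(insert("email", "a@example.com"))
mutation_map["user42"]["Users"].append(delete("old_email"))
mutation_map["user42"]["Logins"].append(insert("last_seen", "2010-05-01"))
mutation_map["user99"]["Users"].append(insert("email", "b@example.com"))

for row_key, by_family in mutation_map.items():
    for family, mutations in by_family.items():
        print(row_key, family, [m["op"] for m in mutations])
```

    The sketch shows only how the batch is grouped by row key and column family; it says nothing about whether the server applies the batch atomically, which is the open question here.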

    Read the article

  • Can I use Cassandra to store objects?

    - by Sandeep
    Hi, my application works like this: there is a database (MySQL) that holds a command. The command is an object (it consists of many fields, such as ints and strings). There is a web service which interacts with the database, gets the command from the db and performs some operation. The way I am storing the command in the db is by stripping out all the fields and inserting them into the db. Can I use Cassandra in place of MySQL and store the command object?
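
    Stripping an object into fields maps fairly naturally onto one row with one column per field. A minimal sketch in Python, assuming the command has only int and string fields as described; the `Command` class, its field names, and the `to_columns` and `from_columns` helpers are hypothetical.

```python
from dataclasses import dataclass, asdict
from typing import get_type_hints


@dataclass
class Command:
    """Hypothetical command object with int and string fields."""
    command_id: int
    name: str
    retries: int


def to_columns(cmd):
    """Flatten the object into column name -> string value."""
    return {name: str(value) for name, value in asdict(cmd).items()}


def from_columns(columns):
    """Rebuild the object, converting each column back to its declared type."""
    hints = get_type_hints(Command)
    kwargs = {name: hints[name](columns[name]) for name in hints}
    return Command(**kwargs)


row = to_columns(Command(7, "reindex", 3))
print(row)
print(from_columns(row))
```

    The row key would be whatever identifies the command, and the resulting dictionary corresponds to the columns written under that key.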

    Read the article
