Search Results

Search found 58499 results on 2340 pages for 'temporal data'.


  • Sensible Way to Pass Web Data to Sql Server Database

    - by Emtucifor
    After exploring several different ways to pass web data to a database for update purposes, I'm wondering if XML might be a good strategy. The database is currently SQL 2000. In a few months it will move to SQL 2005 and I will be able to change things if needed, but I need a SQL 2000 solution now.

    First of all, the database in question uses the EAV model. I know that this kind of database is generally highly frowned on, so for the purposes of this question, please just accept that this is not going to change.

    The current update method has the web server inserting values (that have all been converted first to their correct underlying types, then to sql_variant) into a temp table. A stored procedure is then run which expects the temp table to exist, and it takes care of updating, inserting, or deleting things as needed. So far, only a single element has needed to be updated at a time. But now there is a requirement to be able to edit multiple elements at once, and also to support hierarchical elements, each of which can have its own list of attributes.

    Here's some example XML I hand-typed to demonstrate what I'm thinking of. Note that in this database the Entity is Element, and an ID of 0 signifies "create", i.e. an insert of a new item.

        <Elements>
          <Element ID="1234">
            <Attr ID="221">Value</Attr>
            <Attr ID="225">287</Attr>
            <Attr ID="234">
              <Element ID="99825">
                <Attr ID="7">Value1</Attr>
                <Attr ID="8">Value2</Attr>
                <Attr ID="9" Action="delete" />
              </Element>
              <Element ID="99826" Action="delete" />
              <Element ID="0" Type="24">
                <Attr ID="7">Value4</Attr>
                <Attr ID="8">Value5</Attr>
                <Attr ID="9">Value6</Attr>
              </Element>
              <Element ID="0" Type="24">
                <Attr ID="7">Value7</Attr>
                <Attr ID="8">Value8</Attr>
                <Attr ID="9">Value9</Attr>
              </Element>
            </Attr>
            <Rel ID="3827" Action="delete" />
            <Rel ID="2284" Role="parent">
              <Element ID="3827" />
              <Element ID="3829" />
              <Attr ID="665">1</Attr>
            </Rel>
            <Rel ID="0" Type="23" Role="child">
              <Element ID="3830" />
              <Attr ID="67" />
            </Rel>
          </Element>
          <Element ID="0" Type="87">
            <Attr ID="221">Value</Attr>
            <Attr ID="225">569</Attr>
            <Attr ID="234">
              <Element ID="0" Type="24">
                <Attr ID="7">Value10</Attr>
                <Attr ID="8">Value11</Attr>
                <Attr ID="9">Value12</Attr>
              </Element>
            </Attr>
          </Element>
          <Element ID="1235" Action="delete" />
        </Elements>

    Some Attributes are straight value types, such as AttrID 221. But AttrID 234 is a special "multi-value" type that can have a list of elements underneath it, and each one can have one or more values. Types only need to be presented when a new item is created, since the ElementID fully implies the type if it already exists. I'll probably support only passing in changed items (as detected by javascript). And there may be an Action="delete" on Attr elements as well, since NULLs are treated as "unselected"--sometimes it's very important to know if a Yes/No question has intentionally been answered No or if no one's bothered to say Yes yet.

    There is also a different kind of data, a Relationship. At this time, those are updated through individual AJAX calls as things are edited in the UI, but I'd like to include those so that changes to relationships can be canceled (right now, once you change it, it's done). So those are really elements, too, but they are called Rel instead of Element. Relationships are implemented as ElementID1 and ElementID2, so the RelID 2284 in the XML above is in the database as:

        ElementID  2284
        ElementID1 1234
        ElementID2 3827

    Having multiple children in one relationship isn't currently supported, but it would be nice later.

    Does this strategy and the example XML make sense? Is there a more sensible way? I'm just looking for some broad critique to help save me from going down a bad path. Any aspect that you'd like to comment on would be helpful. The web language happens to be Classic ASP, but that could change to ASP.Net at some point. A persistence engine like Linq or nHibernate is probably not acceptable right now--I just want to get this already working application enhanced without a huge amount of development time. I'll choose the answer that shows experience and has a balance of good warnings about what not to do, confirmations of what I'm planning to do, and recommendations about something else to do. I'll make it as objective as possible.

    P.S. I'd like to handle unicode characters as well as very long strings (10k+).

    UPDATE

    I have had this working for some time and I used the ADO Recordset Save-To-Stream trick to make creating the XML really easy. The result seems to be fairly fast, though if speed ever becomes a problem I may revisit this. In the meantime, my code works to handle any number of elements and attributes on the page at once, including updating, deleting, and creating new items all in one go. I settled on a scheme like so for all my elements:

    Existing data elements: input name e12345_a678 (element 12345, attribute 678); the input value is the value of the attribute.

    New elements: Javascript copies a hidden template of the set of HTML elements needed for the type into the correct location on the page, increments a counter to get a new ID for this item, and prepends the number to the names of the form items.

        var newid = 0;

        function metadataAdd(reference, nameid, value) {
            var t = document.createElement('input');
            t.setAttribute('name', nameid);
            t.setAttribute('id', nameid);
            t.setAttribute('type', 'hidden');
            t.setAttribute('value', value);
            reference.appendChild(t);
        }

        function multiAdd(target, parentelementid, attrid, elementtypeid) {
            var proto = document.getElementById('a' + attrid + '_proto');
            var instance = document.createElement('p');
            target.parentNode.parentNode.insertBefore(instance, target.parentNode);
            var thisid = ++newid;
            instance.innerHTML = proto.innerHTML.replace(/{prefix}/g, 'n' + thisid + '_');
            instance.id = 'n' + thisid;
            instance.className += ' new';
            metadataAdd(instance, 'n' + thisid + '_p', parentelementid);
            metadataAdd(instance, 'n' + thisid + '_c', attrid);
            metadataAdd(instance, 'n' + thisid + '_t', elementtypeid);
            return false;
        }

    Example: template input name _a678 becomes n1_a678 (a new element, the first one on the page, attribute 678). All attributes of this new element are tagged with the same prefix of n1. The next new item will be n2, and so on. Some hidden form inputs are created: n1_t, whose value is the elementtype of the element to be created; n1_p, whose value is the parent id of the element (if it is a relationship); and n1_c, whose value is the child id of the element (if it is a relationship).

    Deleting elements: a hidden input is created in the form e12345_t with its value set to 0. The existing controls displaying that attribute's values are disabled so they are not included in the form post. So "set type to 0" is treated as delete.

    With this scheme, every item on the page has a unique name and can be distinguished properly, and every action can be represented properly.

    When the form is posted, here's a sample of building one of the two recordsets used (classic ASP code):

        Set Data = Server.CreateObject("ADODB.Recordset")
        Data.Fields.Append "ElementID", adInteger, 4, adFldKeyColumn
        Data.Fields.Append "AttrID", adInteger, 4, adFldKeyColumn
        Data.Fields.Append "Value", adLongVarWChar, 2147483647, adFldIsNullable Or adFldMayBeNull
        Data.CursorLocation = adUseClient
        Data.CursorType = adOpenDynamic
        Data.Open

    This is the recordset for values; the other is for the elements themselves. I step through the posted form and, for the element recordset, use a Scripting.Dictionary populated with instances of a custom Class that has the properties I need, so that I can add the values piecemeal, since they don't always come in order. New elements are added as negative to distinguish them from regular elements (rather than requiring a separate column to indicate whether the row is new or addresses an existing element). I use a regular expression to tear apart the form keys:

        "^(e|n)([0-9]{1,10})_(a|p|t|c)([0-9]{0,10})$"

    Then, adding an attribute looks like this:

        Data.AddNew
        ElementID.Value = DataID
        AttrID.Value = Integerize(Matches(0).SubMatches(3))
        AttrValue.Value = Request.Form(Key)
        Data.Update

    ElementID, AttrID, and AttrValue are references to the fields of the recordset. This method is hugely faster than using Data.Fields("ElementID").Value each time. I loop through the Dictionary of element updates and ignore any that don't have all the proper information, adding the good ones to the recordset. Then I call my data-updating stored procedure like so:

        Set Cmd = Server.CreateObject("ADODB.Command")
        With Cmd
            Set .ActiveConnection = MyDBConn
            .CommandType = adCmdStoredProc
            .CommandText = "DataPost"
            .Prepared = False
            .Parameters.Append .CreateParameter("@ElementMetadata", adLongVarWChar, adParamInput, 2147483647, XMLFromRecordset(Element))
            .Parameters.Append .CreateParameter("@ElementData", adLongVarWChar, adParamInput, 2147483647, XMLFromRecordset(Data))
        End With
        Result.Open Cmd ' previously created recordset object with options set

    Here's the function that does the XML conversion:

        Private Function XMLFromRecordset(Recordset)
            Dim Stream
            Set Stream = Server.CreateObject("ADODB.Stream")
            Stream.Open
            Recordset.Save Stream, adPersistXML
            Stream.Position = 0
            XMLFromRecordset = Stream.ReadText
        End Function

    Just in case the web page needs to know, the SP returns a recordset of any new elements, showing their page value and their created value (so I can see that n1 is now e12346, for example). Here are some key snippets from the stored procedure.

    Note this is SQL 2000 for now, though I'll be able to switch to 2005 soon:

        CREATE PROCEDURE [dbo].[DataPost]
            @ElementMetaData ntext,
            @ElementData ntext
        AS
        DECLARE @hdoc int

        --- snip ---

        EXEC sp_xml_preparedocument @hdoc OUTPUT, @ElementMetaData,
            '<xml xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" />'

        INSERT #ElementMetadata (ElementID, ElementTypeID, ElementID1, ElementID2)
        SELECT *
        FROM OPENXML(@hdoc, '/xml/rs:data/rs:insert/z:row', 0)
        WITH (
            ElementID int,
            ElementTypeID int,
            ElementID1 int,
            ElementID2 int
        )
        ORDER BY ElementID -- orders negative items (new elements) first so they begin counting at 1 for later ID calculation

        EXEC sp_xml_removedocument @hdoc

        --- snip ---

        UPDATE E
        SET E.ElementTypeID = M.ElementTypeID
        FROM Element E
        INNER JOIN #ElementMetadata M ON E.ElementID = M.ElementID
        WHERE E.ElementID >= 1 AND M.ElementTypeID >= 1

    The following query does the correlation of the negative new element ids to the newly inserted ones:

        UPDATE #ElementMetadata -- Correlate the new ElementIDs with the input rows
        SET NewElementID = Scope_Identity() - @@RowCount + DataID
        WHERE ElementID < 0

    Other set-based queries do all the other work of validating that the attributes are allowed, are the correct data type, and inserting, updating, and deleting elements and attributes.

    I hope this brief run-down is useful to others some day! Converting ADO Recordsets to an XML stream was a huge winner for me as it saved all sorts of time and had a namespace and schema already defined that made the results come out correctly. Using a flatter XML format with 2 inputs was also much easier than sticking to some ideal about having everything in a single XML stream.
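
    Just to illustrate how mechanically the form-key scheme above decomposes, here is the same regular expression exercised in a few lines of Python (illustration only; it is not part of the ASP code, and the sample keys are taken from the description above):

        import re

        # Same pattern the ASP code uses to tear apart posted form keys.
        KEY = re.compile(r"^(e|n)([0-9]{1,10})_(a|p|t|c)([0-9]{0,10})$")

        for key in ("e12345_a678", "n1_a678", "n1_t", "e12345_t"):
            kind, element_id, field, attr_id = KEY.match(key).groups()
            # kind: 'e' existing / 'n' new; field: 'a' attribute value,
            # 'p' parent, 'c' child, 't' element type (value 0 means delete)
            print(key, kind, element_id, field, attr_id)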

    Read the article

  • uploading via http post (multipart/form-data) silently fails with big files

    - by matteo
    When uploading multipart/form-data forms via an http post request to my apache web server, very big files (i.e. 30MB) are silently discarded. On the server side all looks as if the attached file was received with 0 bytes size. On the client side all looks like it had been uploaded successfully (it takes the expected long time to upload and the browser gives no error message). On the server, nothing is logged into the error log. An entry is logged into the access log as if everything was ok (a post request and a 200 ok response). These uploads are being posted to a php script. In the php script, if I print_r $_FILES, I see the following information for the relevant file:

        [file5] => Array
            (
                [name] => MOV023.3gp
                [type] => video/3gpp
                [tmp_name] => /tmp/phpgOdvYQ
                [error] => 0
                [size] => 0
            )

    Note both [error] = 0 (which should mean no error) and [size] = 0 (as if the file was empty). My php script runs fine and receives all the rest of the data except these files. move_uploaded_file succeeds on these files and actually copies them as 0-byte files. I've already changed the php directives max_upload_size to 50M and post_max_size to 200M, so neither the single file nor the request exceeds any size limit. max_execution_time is not relevant, because the time to transfer the data does not count; and I've increased max_input_time to 1000 seconds, though this shouldn't be necessary since this is the time taken to parse the input data, not the time taken to upload it.

    Is there any apache configuration, prior to php, that could be causing these files to be discarded even before php execution? Some limit on size or on upload time? I've read about a default 300-second timeout limit, but this should apply to the time the connection is idle, not the time it takes while actually transferring data, right? Needless to say, uploads under all exactly identical conditions (including file format, client and everything) except a smaller file size work seamlessly, so the issue is clearly related to the file or request size, or to the time it takes to send it.

    Read the article

  • What is the right way to process inconsistent data files?

    - by Tahabi
    I'm working at a company that uses Excel files to store product data, specifically test results from products before they are shipped out. There are a few thousand spreadsheets with anywhere from 50-100 relevant data points per file. Over the years, the schema for the spreadsheets has changed significantly, but not unidirectionally - in the sense that changes often get reverted and then re-added in the space of a few dozen to a few hundred files. My project is to convert about 8000 of these spreadsheets into a database that can be queried. I'm using MongoDB to deal with the inconsistency in the data, and Python.

    My question is, what is the "right" or canonical way to deal with the huge variance in my source files? I've written a data structure which stores the data I want for the latest template, which will be the final template used going forward, but that only helps for a few hundred files historically. Brute-forcing a solution would mean writing similar data structures for each version/template - which means potentially writing hundreds of schemas with dozens of fields each. This seems very inefficient, especially when sometimes a change in the template is as little as moving a single line of data one row down or splitting what used to be one data field into two data fields.

    A slightly more elegant solution I have in mind would be writing schemas for all the variants I can find for pre-defined groups in the source files, and then writing a function to match a particular series of files with a series of variants that matches that set of files. This is because, more often than not, most of the file will remain consistent over a long period, only marred by one or two errant sections, but within that period, which section is the inconsistent one varies. For example, say a file has four sections with three data fields each, which is represented by four Python dictionaries with three keys each. For files 7000-7250, sections 1-3 will be consistent, but section 4 will be shifted one row down. For files 7251-7500, sections 1-3 are consistent, section 4 is one row down, but a section five appears. For files 7501-7635, sections 1 and 3 will be consistent, but section 2 will have five data fields instead of three, section five disappears, and section 4 is still shifted down one row. For files 7636-7800, section 1 is consistent, section 4 gets shifted back up, section 2 returns to three cells, but section 3 is removed entirely. Files 7800-8000 have everything in order.

    The proposed function would take the file number and match it to a dictionary representing the data mappings for different variants of each section. For example, a section_four_variants dictionary might have two members, one for the shifted-down version and one for the normal version; a section_two_variants might have three- and five-field members; etc. The script would then read the matchings, load the correct mapping, extract the data, and insert it into the database.

    Is this an accepted/right way to go about solving this problem? Should I structure things differently? I don't know what to search Google for either to see what other solutions might be, though I believe the problem lies in the domain of ETL processing. I also have no formal CS training aside from what I've taught myself over the years. If this is not the right forum for this question, please tell me where to move it, if at all. Any help is most appreciated. Thank you.
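
    A minimal sketch of the range-to-variant lookup described above, in Python; the range boundaries come from the example in the question, but the section names, field counts and defaults are invented purely for illustration:

        # Per-section variants; each variant says how to pull its fields out of a sheet.
        SECTION_VARIANTS = {
            "section_two":  {"normal": {"fields": 3}, "wide": {"fields": 5}},
            "section_four": {"normal": {"row_offset": 0}, "shifted": {"row_offset": 1}},
            # ... one entry per section ...
        }

        # Which variant applies to which span of file numbers (anything not listed
        # falls back to the "normal" variant for that section).
        VARIANT_RANGES = [
            (range(7000, 7251), {"section_four": "shifted"}),
            (range(7251, 7501), {"section_four": "shifted", "section_five": "present"}),
            (range(7501, 7636), {"section_four": "shifted", "section_two": "wide"}),
        ]

        def variants_for(file_number):
            """Return {section_name: variant_name} overrides for one file."""
            chosen = {}
            for span, overrides in VARIANT_RANGES:
                if file_number in span:
                    chosen.update(overrides)
            return chosen

        # Usage: pick the mapping, extract with it, insert the resulting dict into MongoDB.
        overrides = variants_for(7300)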

    Read the article

  • jstree dynamic JSON data from django

    - by danspants
    I'm trying to set up jsTree to dynamically accept JSON data from django. This is the test data I have django returning to jstree:

        result = [{
            "data" : "A node",
            "children" : [ { "data" : "Only child", "state" : "closed" } ],
            "state" : "open"
        }, "Ajax node"]
        response = HttpResponse(content=result, mimetype="application/json")

    This is the jstree code I'm using:

        jQuery("#demo1").jstree({
            "json_data" : {
                "ajax" : {
                    "url" : "/dirlist",
                    "data" : function (n) {
                        return { id : n.attr ? n.attr("id") : 0 };
                    },
                    error: function(e){alert(e);}
                }
            },
            "plugins" : [ "themes", "json_data" ]
        });

    All I get is the ajax loading symbol; the ajax error response is also triggered and it alerts "undefined". I've also tried simpleJson encoding in django but with the same result. If I change the url so that it is receiving a JSON file with identical data, it works as expected. Any ideas on what the issue might be?
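
    One thing worth double-checking on the django side is that the view returns an actual JSON string: a Python list handed straight to HttpResponse is not serialised as JSON (the dict inside comes out in Python's single-quoted repr form, which jsTree will not parse). A hedged sketch of what the view might look like, assuming an older django where HttpResponse takes mimetype (newer versions use content_type):

        import json
        from django.http import HttpResponse

        def dirlist(request):
            result = [{
                "data": "A node",
                "children": [{"data": "Only child", "state": "closed"}],
                "state": "open",
            }, "Ajax node"]
            # json.dumps produces double-quoted, spec-valid JSON for jsTree to parse
            return HttpResponse(json.dumps(result), mimetype="application/json")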

    Read the article

  • How to filter the jqGrid data NOT using the built in search/filter box

    - by Jimbo
    I want users to be able to filter grid data without using the intrinsic search box. I have created two input fields for date (from and to) and now need to tell the grid to adopt this as its filter and then to request new data. Forging a server request for grid data (bypassing the grid) and setting the grid's data to be the response data won't work - because as soon as the user tries to re-order the results or change the page etc., the grid will request new data from the server using a blank filter. I can't seem to find grid API to achieve this - does anyone have any ideas? Thanks.

    Read the article

  • Qt qUncompress gzip data

    - by talei
    Hello, I stumbled upon a problem and can't find a solution. What I want to do is uncompress data in Qt, using qUncompress(QByteArray), sent from the web in gzip format. I used wireshark to determine that this is a valid gzip stream, and also tested with zip/rar, both of which can uncompress it. The code so far is like this:

        static const char dat[40] = {
            0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03,
            0xaa, 0x2e, 0x2e, 0x49, 0x2c, 0x29, 0x2d, 0xb6, 0x4a, 0x4b,
            0xcc, 0x29, 0x4e, 0xad, 0x05, 0x00, 0x00, 0x00, 0xff, 0xff,
            0x03, 0x00, 0x2a, 0x63, 0x18, 0xc5, 0x0e, 0x00, 0x00, 0x00
        }; // this data contains the string {status:false}, in gzip format

        QByteArray data;
        data.append( dat, sizeof(dat) );

        unsigned int size = 14; // expected uncompressed size, reconstructed big-endian
        // prepend expected uncompressed size; the last 4 bytes in dat are 0x0e = 14
        QByteArray dataPlusSize;
        dataPlusSize.append( (unsigned int)((size >> 24) & 0xFF));
        dataPlusSize.append( (unsigned int)((size >> 16) & 0xFF));
        dataPlusSize.append( (unsigned int)((size >> 8) & 0xFF));
        dataPlusSize.append( (unsigned int)((size >> 0) & 0xFF));

        QByteArray uncomp = qUncompress( dataPlusSize );
        qDebug() << uncomp;

    And uncompression fails with: qUncompress: Z_DATA_ERROR: Input data is corrupted. AFAIK gzip consists of a 10-byte header, the DEFLATE payload, and a 12-byte trailer (8-byte CRC32 + 4-byte ISIZE, the uncompressed data size). Stripping the header and trailer should leave me with the DEFLATE data stream, but qUncompress yields the same error. I checked with a data string compressed in PHP, like this:

        $stringData = gzcompress( "{status:false}", 1);

    and qUncompress uncompresses that data. (I didn't see any gzip header there though, i.e. ID1 = 0x1f, ID2 = 0x8b.) I stepped through the above code with the debugger, and the error occurs at inflate.c line 610:

        if (
        #endif
            ((BITS(8) << 8) + (hold >> 8)) % 31) {   // here is the error, WHY? long unsigned int hold = 35615
            strm->msg = (char *)"incorrect header check";
            state->mode = BAD;
            break;
        }

    I know that qUncompress is simply a wrapper around zlib, so I suppose it should handle gzip without any problem. Any comments are more than welcome. Best regards
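
    For what it's worth, the distinction the "incorrect header check" branch is testing (a zlib/RFC 1950 wrapper vs. a gzip/RFC 1952 wrapper around the same DEFLATE payload) can be seen outside Qt as well. A small illustration using Python's zlib module, which sits on the same library; this only illustrates the two container formats and is not Qt code:

        import zlib

        payload = b"{status:false}"

        zlib_blob = zlib.compress(payload)   # RFC 1950: 2-byte header (0x78 ...), Adler-32 trailer
        # A gzip stream (RFC 1952) instead starts with 0x1f 0x8b and ends with CRC-32 + ISIZE.

        zlib.decompress(zlib_blob)           # fine
        # Feeding a gzip stream to the same call raises "incorrect header check";
        # zlib only accepts the gzip wrapper when asked for it explicitly, e.g.:
        #     zlib.decompress(gzip_blob, 16 + zlib.MAX_WBITS)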

    Read the article

  • Machine learning challenge: diagnosing program in java/groovy (datamining, machine learning)

    - by Registered User
    Hi All! I'm planning to develop a program in Java which will provide a diagnosis. The data set is divided into two parts, one for training and the other for testing. My program should learn to classify from the training data (each record is on a new line, with the answers to 30 questions each in its own column; the last column is the diagnosis, 0 or 1; in the testing part of the data the diagnosis column is empty; the data set contains about 1000 records) and then make predictions for the testing part of the data. I've never done anything similar, so I'll appreciate any advice or information about solutions to similar problems. I was thinking about the Java Machine Learning Library or the Java Data Mining Package, but I'm not sure if that's the right direction, and I'm still not sure how to tackle this challenge... Please advise. All the best!
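
    The question asks for Java (where Weka or Java-ML are the usual choices and follow the same train-then-predict shape), but purely to illustrate the workflow on a 30-answers-plus-label layout, here is a sketch in Python with scikit-learn; the file names and the choice of classifier are assumptions for the example:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        train = np.loadtxt("train.csv", delimiter=",")     # 31 columns: 30 answers + diagnosis
        X_train, y_train = train[:, :30], train[:, 30]

        model = LogisticRegression(max_iter=1000)
        model.fit(X_train, y_train)                        # learn from the labelled records

        X_test = np.loadtxt("test.csv", delimiter=",", usecols=range(30))
        predictions = model.predict(X_test)                # a 0/1 diagnosis per test record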

    Read the article

  • RDLC - Adding a Data Source in VS2010

    - by Kezzer
    Greetings. I have an RDLC file and am wanting to add a data source to it, although without any luck so far. The data source is a custom class written by myself (just to add, we do this all the time). We recently converted over to the VS2010 RDLC format, which caused some problems, but we've made some changes to our implementation that work around the more major issues.

    So, back to the issue at hand: when I attempt to add my data source to the DummyDataSource list in the RDLC view in VS2010, it just does nothing. It does add the data source to the list of data sources, but you can't select it from the drop-down list in the RDLC view, which means I can't add the data source at all. Has anyone come across this problem? Is there anything I need to check? I've searched with fervour and had no luck.

    Read the article

  • Database design for very large amount of data

    - by Hossein
    Hi, I am working on a project involving a large amount of data from the delicious website. The data available is in files of "Date, UserId, Url, Tags" (one line per bookmark). I normalized my database to 3NF, and because of the nature of the queries that we wanted to use in combination, I came down to 6 tables. The design looks fine; however, now that a large amount of data is in the database, most of the queries need to join at least 2 tables together to get the answer, sometimes 3 or 4. At first we didn't have any performance issues, because for testing purposes we hadn't added too much data to the database. Now that we have a lot of data, simply joining extremely large tables takes a lot of time, and for our project, which has to be real-time, this is a disaster. I was wondering how big companies solve these issues. It looks like normalizing tables just adds complexity, but how do big companies handle large amounts of data in their databases - don't they do the normalization? Thanks

    Read the article

  • Best Practices - Data Annotations vs OnChanging in Entity Framework 4

    - by jptacek
    I was wondering what the general recommendation is for Entity Framework in terms of data validation. I am relatively new to EF, but it appears there are two main approaches to data validation. The first is to create a partial class for the model, and then perform data validations and update a rule violation collection of some sort. This is outlined at http://msdn.microsoft.com/en-us/library/cc716747.aspx The other is to use data annotations and then have the annotations perform data validation. Scott Guthrie explains this on his blog at http://weblogs.asp.net/scottgu/archive/2010/01/15/asp-net-mvc-2-model-validation.aspx. I was wondering what the benefits are of one over the other. It seems the data annotations would be the preferred mechanism, especially as you move to RIA Services, but I want to ensure I am not missing something. Of course, nothing precludes using both of them together. Thanks John

    Read the article

  • What's an efficient way of calculating the nearest point?

    - by Griffo
    I have objects with location data stored in Core Data, I would like to be able to fetch and display just the nearest point to the current location. I'm aware there are formulas which will calculate the distance from current lat/long to a stored lat/long, but I'm curious about the best way to perform this for a set of 1000+ points stored in Core Data. I know I could just return the points from Core Data to an array and then loop through that looking for the min value for distance between the points but I'd imagine there's a more efficient method, possibly leveraging Core Data in some way. Any insight would be appreciated. EDIT: I don't know how I missed this on my initial search but this SO question suggests just iterating through an array of Core Data objects but limiting the array size with a bounding box based on the current location. Is this the best I can do?
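
    The bounding-box-then-scan idea from that linked question is easy to sketch in the abstract (plain Python here just to show the arithmetic; in the app the box test would become a fetch predicate on the stored latitude/longitude attributes, and the box size is an arbitrary choice):

        import math

        def haversine_km(lat1, lon1, lat2, lon2):
            """Great-circle distance between two lat/long pairs, in kilometres."""
            p1, p2 = math.radians(lat1), math.radians(lat2)
            dp = math.radians(lat2 - lat1)
            dl = math.radians(lon2 - lon1)
            a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
            return 2 * 6371.0 * math.asin(math.sqrt(a))

        def nearest(points, here, box_km=10.0):
            """points: iterable of (lat, lon); here: (lat, lon)."""
            points = list(points)
            lat, lon = here
            dlat = box_km / 111.0                      # ~111 km per degree of latitude
            dlon = box_km / (111.0 * max(math.cos(math.radians(lat)), 0.01))
            candidates = [p for p in points
                          if abs(p[0] - lat) <= dlat and abs(p[1] - lon) <= dlon] or points
            return min(candidates, key=lambda p: haversine_km(lat, lon, p[0], p[1]))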

    Read the article

  • Need help receiving ByteArray data

    - by k80sg
    Hi folks, I am trying to receive byte array data from a machine. It sends out 3 different types of data structure, each with a different number of fields consisting of mostly ints and a few floats, and different byte sizes: 320 bytes for the first type, 420 for the second and 560 for the third. When the sending program is launched, it fires all 3 types of data simultaneously at an interval of 1 sec. Example sending order:

        Pack1 - 320 bytes
        (1 sec later) Pack2 - 420 bytes
        (1 sec later) Pack3 - 560 bytes
        (1 sec later) Pack1 - 320 bytes
        ...

    How do I check the incoming byte size before passing it to byte[] handsize = new byte[bytesize]? The data I receive is all out of order; for instance, using the following to read all ints:

        System.out.println("Reading data in int format:" + " " + datainput.readInt());

    I get many different sets of values whenever I run my program, with some valid field data but all over the place. I am not too sure how exactly I should do it, but I tried the following, and apparently my data fields are not received in the correct sequence:

        BufferedInputStream bais = new BufferedInputStream(requestSocket.getInputStream());
        DataInputStream datainput = new DataInputStream(bais);
        byte[] handsize = new byte[560];
        datainput.readFully(handsize);

        int n = 0;
        int intByte[] = new int[140];
        for (int i = 0; i < 140; i++) {
            System.out.println("Reading data in int format:" + " " + datainput.readInt());
            intByte[n] = datainput.readInt();
            n = n + 1;
            System.out.println("The value in array is:" + intByte[0]);
            System.out.println("The value in array is:" + intByte[1]);
            System.out.println("The value in array is:" + intByte[2]);
            System.out.println("The value in array is:" + intByte[3]);
        }

    Also, from the above code, the order of the values printed out with System.out.println("Reading data in int format:" + " " + datainput.readInt()); and System.out.println("The value in array is:" + intByte[0]); etc. are different. Any help will be appreciated. Thanks
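
    Not a Java answer as such, but the fixed-size-frame idea the question describes (read exactly as many bytes as the record type needs, then decode every field from that one buffer rather than reading the stream a second time) can be sketched compactly. Python is used here only for illustration; the sizes come from the question, the field layout is assumed:

        import struct

        PACK_SIZES = (320, 420, 560)       # the three record types described above

        def read_exact(sock, n):
            """Read exactly n bytes from a socket (the readFully equivalent)."""
            buf = b""
            while len(buf) < n:
                chunk = sock.recv(n - len(buf))
                if not chunk:
                    raise EOFError("connection closed mid-record")
                buf += chunk
            return buf

        def decode_ints(frame):
            """Decode a whole frame as big-endian 32-bit ints (DataInputStream byte order)."""
            count = len(frame) // 4
            return struct.unpack(">%di" % count, frame[:count * 4])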

    Read the article

  • C# Winforms ADO.NET - DataGridView INSERT starting with null data

    - by Geo Ego
    I have a C# Winforms app that is connecting to a SQL Server 2005 database. The form I am currently working on allows a user to enter a large amount of different types of data into various textboxes, comboboxes, and a DataGridView to insert into the database. It all represents one particular type of machine, and the data is spread out over about nine tables. The problem I have is that my DataGridView represents a type of data that may or may not be added to the database. Thus, when the DataGridView is created, it is empty and not databound, and so data cannot be entered. My question is, should I create the table with hard-coded field names representing the way that the data looks in the database, or is there a way to simply have the column names populate with no data so that the user can enter it if they like? I don't like the idea of hard-coding them in case there is a change in the database schema, but I'm not sure how else to deal with this problem.

    Read the article

  • C# SqlBulkCopy and Data Entities

    - by KP
    Guys, my current project consists of 3 standard layers: data, business, and presentation. I would like to use data entities for all my data access needs. Part of the functionality of the app is that it will need to copy all data within a flat file into a database. The file is not so big, so I can use SqlBulkCopy. I have found several articles regarding the usage of the SqlBulkCopy class in .NET. However, all the articles are using DataTables to move data back and forth. Is there a way to use data entities along with SqlBulkCopy, or will I have to use DataTables?

    Read the article

  • Memory-efficient import of many data files into a pandas DataFrame in Python

    - by richardh
    I import a directory of |-delimited .dat files into a pandas DataFrame. The following code works, but I eventually run out of RAM with a MemoryError:

        import pandas as pd
        import glob

        temp = []
        dataDir = 'C:/users/richard/research/data/edgar/masterfiles'
        for dataFile in glob.glob(dataDir + '/master_*.dat'):
            print dataFile
            temp.append(pd.read_table(dataFile, delimiter='|', header=0))
        masterAll = pd.concat(temp)

    Is there a more memory-efficient approach? Or should I go whole hog to a database? (I will move to a database eventually, but I am baby-stepping my move to pandas.) Thanks! FWIW, here is the head of an example .dat file:

        cik|cname|ftype|date|fileloc
        1000032|BINCH JAMES G|4|2011-03-08|edgar/data/1000032/0001181431-11-016512.txt
        1000045|NICHOLAS FINANCIAL INC|10-Q|2011-02-11|edgar/data/1000045/0001193125-11-031933.txt
        1000045|NICHOLAS FINANCIAL INC|8-K|2011-01-11|edgar/data/1000045/0001193125-11-005531.txt
        1000045|NICHOLAS FINANCIAL INC|8-K|2011-01-27|edgar/data/1000045/0001193125-11-015631.txt
        1000045|NICHOLAS FINANCIAL INC|SC 13G/A|2011-02-14|edgar/data/1000045/0000929638-11-00151.txt
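
    One memory-saving variant, sketched under the assumption of a reasonably recent pandas and the column names visible in the file head above: read only the columns that are actually needed and give them compact dtypes before concatenating (the column subset and dtypes here are illustrative, not a recommendation for this particular dataset):

        import glob
        import pandas as pd

        dataDir = 'C:/users/richard/research/data/edgar/masterfiles'
        pieces = []
        for dataFile in glob.glob(dataDir + '/master_*.dat'):
            pieces.append(pd.read_table(
                dataFile,
                delimiter='|',
                header=0,
                usecols=['cik', 'ftype', 'date', 'fileloc'],   # skip columns you never query
                dtype={'cik': 'int64', 'ftype': 'category'},   # category beats object for repeated strings
            ))
        masterAll = pd.concat(pieces, ignore_index=True)

    If even the trimmed frames do not fit, appending each piece to an on-disk store (for example HDF5 via pandas' HDFStore.append) instead of keeping the list in memory is the next step before reaching for a database.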

    Read the article

  • Providing dynamic data to webpage

    - by Marius
    Hi, I have a web page that displays dynamic data which changes every 2 seconds. Data is selected from various data sources including Oracle. Currently, the page reloads every 10 seconds and runs a PHP script which retrieves the data and displays the page. I have other pages that give a different view on the same data. This means the same query is run again for them as well. If I have 4 of these pages with 10 concurrent users each, suddenly the data retrieval happens 40 times every 10 seconds, which is obviously not ideal. I have some ideas on how to improve this situation, but I thought I would ask for some ideas from other experts that might have come across a similar situation. I'm not bound to PHP, and my server is on a Linux platform. Regards Marius
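
    One common shape for this (sketched in Python only because the question says it is not bound to PHP; the snapshot path and interval are placeholders): let a single background job run the expensive queries once per interval and write a snapshot, and have every page view, whichever of the 4 pages it is, read the snapshot instead of hitting Oracle itself:

        import json, time

        SNAPSHOT = '/tmp/dashboard_snapshot.json'   # placeholder path

        def refresh_loop(fetch, interval=2.0):
            """fetch() runs the expensive queries once; all views share the result."""
            while True:
                data = fetch()
                with open(SNAPSHOT, 'w') as fh:
                    json.dump({'ts': time.time(), 'data': data}, fh)
                time.sleep(interval)

    The per-page scripts then only deserialize the snapshot, so 40 page views per 10 seconds cost 40 small file reads rather than 40 rounds of database queries.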

    Read the article

  • How to use json object notation to retrieve dbpedia json data

    - by Margi
    In my php code, I am retrieving json data as below:

        <?php
        $url = "http://dbpedia.org/data/Los_Angeles.json";
        $data = file_get_contents($url);
        echo $data;
        ?>

    The javascript code consumes this json data returned from php and gets the json object as below:

        var doc = eval('(' + request.responseText + ')');

    How do I retrieve the following json data using dot notation? The keys contain URLs.

        "http://dbpedia.org/ontology/populationTotal" : [
            { "type" : "literal", "value" : 3792621, "datatype" : "http://www.w3.org/2001/XMLSchema#integer" } ],
        "http://dbpedia.org/ontology/PopulatedPlace/areaTotal" : [
            { "type" : "literal", "value" : "1301.9688931491348", "datatype" : "http://dbpedia.org/datatype/squareKilometre" }

    Read the article

  • Write data to an xml file.

    - by Bobby
    My aim is to write an XML file with a few tags whose values are in the regional language. I'm using Python to do this and using IDLE (Python GUI) for programming. While I try to write the words to an xml file it gives the following error:

        UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)

    For now, I'm not using any xml writer library; instead I'm opening a file "test.xml" and writing the data into it. This error is encountered by the line: f.write("\t\" + data + "\"). If I replace the above write statement with a print statement then it prints the data properly on the Python Shell. I'm reading the data from an excel file which is not in the UTF-8, 16, or 32 encoding formats. It's in some other format; cp1252 is reading the data properly. Any help in getting this data written to an xml file would be highly appreciated. Thanks in advance. Bob
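
    A minimal sketch of the usual fix, assuming Python 2 (the IDLE era of the question): open the output file with an explicit UTF-8 encoding instead of relying on the default ascii codec, and declare the encoding in the XML prolog. The tag name below is made up, since the original tag was lost from the question:

        # -*- coding: utf-8 -*-
        import codecs

        data = u"\u0915\u0916"   # stand-in for the regional-language string read from Excel

        with codecs.open("test.xml", "w", encoding="utf-8") as f:
            f.write(u"<?xml version='1.0' encoding='UTF-8'?>\n")
            f.write(u"\t<field>" + data + u"</field>\n")   # <field> is illustrative only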

    Read the article

  • ItemFileWriteStore: how to change the data?

    - by jeff porter
    Hi, I'd like to change the data in my dojo.data.ItemFileWriteStore. Currently I have...

        var rawdataCacheItems = [
            {'cachereq':'14:52:03','name':'Test1','cacheID':'3','ver':'7'}
        ];
        var cacheInfo = new dojo.data.ItemFileWriteStore({
            data: {
                identifier: 'cacheID',
                label: 'cacheID',
                items: rawdataCacheItems
            }
        });

    I'd like to make an XHR request to get a new JSON string of the data to render. What I can't work out is how to change the data in the "items" held by the ItemFileWriteStore. Can anyone point me in the correct direction? Thanks Jeff Porter

    Read the article

  • selectors-api for data attributes

    - by MJ
    In HTML5, CSS selectors seem to operate well with data-* attributes. For example:

        <style>
            div[data-foo='bar'] { background:#eee; }
        </style>
        <div data-foo='bar'>colored</div>
        <div>not colored</div>

    will properly style the first div. But attempts to select such elements using the selectors-api fail. Examples:

        var foos = document.querySelectorAll("div[data-foo]='bar'");

    or

        var foos = document.querySelectorAll("div data-foo='bar'");

    In Chrome and Safari, this produces a cryptic error: SYNTAX_ERR: DOM Exception 12. Any thoughts on how to use the selectors-api to properly select elements on the basis of data-* attributes?

    Read the article

  • data directory in automake

    - by Alex Farber
    I have some data files that should be distributed with my program. Using dist_pkgdata_DATA in Makefile.am, I get these files installed to /usr/local/data/share/package-name. The problem is that this data is read-only, and my program needs to modify it. Playing with the dist_sharedstate_DATA, dist_localstate_DATA, and dist_data_DATA variables, I got different installation directories, like /usr/local/com and /usr/local/var, but the data is always read-only. How can I distribute modifiable data files with my package? I need some common directory for all users, or maybe local data in a user directory.

    Read the article

  • DB2 load partitioned data in parallel

    - by Erik Paulson
    I have a 10-node DB2 9.5 database, with raw data on each machine, i.e.:

        node1:/scratch/data/dataset.1
        node2:/scratch/data/dataset.2
        ...
        node10:/scratch/data/dataset.10

    There is no shared NFS mount - none of my machines could handle all of the datasets combined. Each line of a dataset file is a long string of text, column delimited. The first column is the key. I don't know the hash function that DB2 will use, so the dataset is not pre-partitioned. Short of renaming all of my files, is there any way to get DB2 to load them all in parallel? I'm trying to do something like:

        load from '/scratch/data/dataset' of del modified by coldel| fastparse
            messages /dev/null replace into TESTDB.data_table
            part_file_location '/scratch/data';

    but I have no idea how to suggest to db2 that it should look for dataset.1 on the first node, etc.

    Read the article

  • Access MP3 audio data independently of ID3 tags?

    - by kyl191
    Hi, this is a 2 part question. First off, is it possible to access the audio data in an MP3 independently of the ID3 tags, and secondly, is there any way to do so using available libraries? I recently consolidated my music collection from 3 computers and ended up with songs which had changed ID3 tags, but the audio data itself was unmodified. Running a search for duplicate files failed because the file changed with the ID3 tag change, but I think it should be possible to identify duplicate files if I just run a deduplication using the audio data for comparison. I know that it's possible to seek to a particular position past the ID3 header in the file, and directly read the data, but was wondering if there's a library that would expose the audio data so I could just extract the data, run a checksum on it, and store the computed result somewhere, then look for identical checksums. (Also, I'd probably have to use some kind of library when you take into account variable length headers.)
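
    For the second part, the skip-the-tags-and-hash idea can be done without a full MP3 library if only ID3v1/ID3v2 tags are in play: an ID3v2 tag announces its own length in four sync-safe bytes of its 10-byte header, and an ID3v1 tag is a fixed 128-byte block at the end of the file starting with "TAG". A sketch in Python 3 (it deliberately ignores complications like ID3v2 footers, APE tags, or padding changes):

        import hashlib

        def audio_fingerprint(path):
            """Hash only the bytes between an ID3v2 header and an ID3v1 trailer."""
            with open(path, "rb") as fh:
                blob = fh.read()
            start = 0
            if blob[:3] == b"ID3":                 # ID3v2 tag present at the front
                tag_size = 0
                for b in blob[6:10]:               # sync-safe size: 7 bits per byte
                    tag_size = (tag_size << 7) | (b & 0x7F)
                start = 10 + tag_size
            end = len(blob)
            if blob[-128:-125] == b"TAG":          # ID3v1 tag present at the end
                end -= 128
            return hashlib.sha1(blob[start:end]).hexdigest()

    Two files whose audio stream is untouched but whose tags were rewritten should then fingerprint identically, which is enough for the duplicate scan described above.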

    Read the article

  • DataSetProvider - DataSet to ClientDataSet

    - by nomad311
    I am trying to take data from a TMSQuery which is connected to my DB and populate a ClientDataSet with some of that data using a DataSetProvider. My problem is that I will need to modify some of this data before it can go into my ClientDataSet. The ClientDataSet has persistent fields that will not match up with the raw DB data. I can't even get a string from the DB into a memo field in ClientDataSet. The ClientDataSet is a part of my data tier so I will need to conform the data from the DB to the ClientDataSet field by field (well most will be able to go right through, but many will require routing and/or conversion). Does anyone have experience with this?

    Read the article
