Search Results

Search found 118 results on 5 pages for 'morpheous'.

Page 2/5 | < Previous Page | 1 2 3 4 5  | Next Page >

  • Database warehouse design: fact tables and dimension tables

    - by morpheous
    I am building a poor man's data warehouse using a RDBMS. I have identified the key 'attributes' to be recorded as: sex (true/false) demographic classification (A, B, C etc) place of birth date of birth weight (recorded daily): The fact that is being recorded My requirements are to be able to run 'OLAP' queries that allow me to: 'slice and dice' 'drill up/down' the data and generally, be able to view the data from different perspectives After reading up on this topic area, the general consensus seems to be that this is best implemented using dimension tables rather than normalized tables. Assuming that this assertion is true (i.e. the solution is best implemented using fact and dimension tables), I would like to seek some help in the design of these tables. 'Natural' (or obvious) dimensions are: Date dimension Geographical location Which have hierarchical attributes. However, I am struggling with how to model the following fields: sex (true/false) demographic classification (A, B, C etc) The reason I am struggling with these fields is that: They have no obvious hierarchical attributes which will aid aggregation (AFAIA) - which suggest they should be in a fact table They are mostly static or very rarely change - which suggests they should be in a dimension table. Maybe the heuristic I am using above is too crude? I will give some examples on the type of analysis I would like to carryout on the data warehouse - hopefully that will clarify things further. I would like to aggregate and analyze the data by sex and demographic classification - e.g. answer questions like: How does male and female weights compare across different demographic classifications? Which demographic classification (male AND female), show the most increase in weight this quarter. etc. Can anyone clarify whether sex and demographic classification are part of the fact table, or whether they are (as I suspect) dimension tables.? Also assuming they are dimension tables, could someone elaborate on the table structures (i.e. the fields)? The 'obvious' schema: CREATE TABLE sex_type (is_male int); CREATE TABLE demographic_category (id int, name varchar(4)); may not be the correct one.

    Read the article

  • Searching for specific HTML string using Python

    - by Morpheous
    What modules would be the best to write a python program that searches through hundreds of html documents and deletes a certain string of html that is given. For instance, if I have an html doc that has <a href="test.html">Test</a> and I want to delete this out of every html page that has it. Any help is much appreciated, and I don't need someone to write the program for me, just a helpful point in the right direction.

    Read the article

  • Database warehoue design: fact tables and dimension tables

    - by morpheous
    I am building a poor man's data warehouse using a RDBMS. I have identified the key 'attributes' to be recorded as: sex (true/false) demographic classification (A, B, C etc) place of birth date of birth weight (recorded daily): The fact that is being recorded My requirements are to be able to run 'OLAP' queries that allow me to: 'slice and dice' 'drill up/down' the data and generally, be able to view the data from different perspectives After reading up on this topic area, the general consensus seems to be that this is best implemented using dimension tables rather than normalized tables. Assuming that this assertion is true (i.e. the solution is best implemented using fact and dimension tables), I would like to see some help in the design of these tables. 'Natural' (or obvious) dimensions are: Date dimension Geographical location Which have hierarchical attributes. However, I am struggling with how to model the following fields: sex (true/false) demographic classification (A, B, C etc) The reason I am struggling with these fields is that: They have no obvious hierarchical attributes which will aid aggregation (AFAIA) - which suggest they should be in a fact table They are mostly static or very rarely change - which suggests they should be in a dimension table. Maybe the heuristic I am using above is too crude? I will give some examples on the type of analysis I would like to carryout on the data warehouse - hopefully that will clarify things further. I would like to aggregate and analyze the data by sex and demographic classification - e.g. answer questions like: How does male and female weights compare across different demographic classifications? Which demographic classification (male AND female), show the most increase in weight this quarter. etc. Can anyone clarify whether sex and demographic classification are part of the fact table, or whether they are (as I suspect) dimension tables.? Also assuming they are dimension tables, could someone elaborate on the table structures (i.e. the fields)? The 'obvious' schema: CREATE TABLE sex_type (is_male int); CREATE TABLE demographic_category (id int, name varchar(4)); may not be the correct one.

    Read the article

  • Can a PL/pgSQL function contain a dynamic subquery?

    - by morpheous
    I am writing a PL/pgSQL function. The function has input parameters which specify (indirectly), which tables to read filtering information from. The function embeds business logic which allows it to select data from different tables based on the input arguments. The function dynamically builds a subquery which returns filtering data which is then used to run the main query. My questions are: Is it 'legal' to use a dynamic subquery in a PL/pgSQL function. I cant see why not - but this question is related to the next one. AFAIK, PL/pgSQL are cached or precompiled by the query engine. How does having a function that generates dynamic subqueries impact the work of the query engine?

    Read the article

  • How to dump memcache contents to console in a telnet session

    - by morpheous
    I am having problems with memcached, and I need to find what is actually stored in the cache. I have telnet'ed into the daemon and am listening on the port, and can send the daemon commands via the CLI. I want to know if anyone is aware of a command I can use to dump the contents of the cache to the console. I know of commands like 'flush_all' and 'stats' but those are not quite sufficient for what I want to do.

    Read the article

  • Paypal subscription API - transaction variables not POSTed

    - by morpheous
    I am writing a payments system based around paypal, I am using the HTML 'API'. I am passing the following form fields to PayPal: 'rm' = 2 'return = http://www.example.com/payment-handler.php?token=sometoken Where 'token' is a token I generated. According to the paypal documentation, a return method (rm) of 2 indicates to Paypal that the transaction data be posted back to the callback url using the POST method. When processing items using 'buy_now' buttons, the transaction items are correctly POSTed to my callback url (payment-handler.php), but for 'subscribe' operations, although the callback url is called, no POST data is sent to the url, and also, the 'token' field is missing. Instead, there is a parameter called 'auth'. I cant see anything in the paypal docs about a 'auth' field - so I dont know whats generating it and if I can reliably using it. Can anyone shed some light on this?

    Read the article

  • Python form POST using urllib2 (also question on saving/using cookies)

    - by morpheous
    I am trying to write a function to post form data and save returned cookie info in a file so that the next time the page is visited, the cookie information is sent to the server (i.e. normal browser behavior). I wrote this relatively easily in C++ using curlib, but have spent almost an entire day trying to write this in Python, using urllib2 - and still no success. This is what I have so far: import urllib, urllib2 import logging # the path and filename to save your cookies in COOKIEFILE = 'cookies.lwp' cj = None ClientCookie = None cookielib = None logger = logging.getLogger(__name__) # Let's see if cookielib is available try: import cookielib except ImportError: logger.debug('importing cookielib failed. Trying ClientCookie') try: import ClientCookie except ImportError: logger.debug('ClientCookie isn\'t available either') urlopen = urllib2.urlopen Request = urllib2.Request else: logger.debug('imported ClientCookie succesfully') urlopen = ClientCookie.urlopen Request = ClientCookie.Request cj = ClientCookie.LWPCookieJar() else: logger.debug('Successfully imported cookielib') urlopen = urllib2.urlopen Request = urllib2.Request # This is a subclass of FileCookieJar # that has useful load and save methods cj = cookielib.LWPCookieJar() login_params = {'name': 'anon', 'password': 'pass' } def login(theurl, login_params): init_cookies(); data = urllib.urlencode(login_params) txheaders = {'User-agent' : 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'} try: # create a request object req = Request(theurl, data, txheaders) # and open it to return a handle on the url handle = urlopen(req) except IOError, e: log.debug('Failed to open "%s".' % theurl) if hasattr(e, 'code'): log.debug('Failed with error code - %s.' % e.code) elif hasattr(e, 'reason'): log.debug("The error object has the following 'reason' attribute :"+e.reason) sys.exit() else: if cj is None: log.debug('We don\'t have a cookie library available - sorry.') else: print 'These are the cookies we have received so far :' for index, cookie in enumerate(cj): print index, ' : ', cookie # save the cookies again cj.save(COOKIEFILE) #return the data return handle.read() # FIXME: I need to fix this so that it takes into account any cookie data we may have stored def get_page(*args, **query): if len(args) != 1: raise ValueError( "post_page() takes exactly 1 argument (%d given)" % len(args) ) url = args[0] query = urllib.urlencode(list(query.iteritems())) if not url.endswith('/') and query: url += '/' if query: url += "?" + query resource = urllib.urlopen(url) logger.debug('GET url "%s" => "%s", code %d' % (url, resource.url, resource.code)) return resource.read() When I attempt to log in, I pass the correct username and pwd,. yet the login fails, and no cookie data is saved. My two questions are: can anyone see whats wrong with the login() function, and how may I fix it? how may I modify the get_page() function to make use of any cookie info I have saved ?

    Read the article

  • Using multiple aggregate functions in an (ANSI) SQL statement

    - by morpheous
    I have aggregate functions foo(), foobar(), fredstats(), barneystats() I want to create a domain specific query language (DSQL) above my DB, to facilitate using a domain language to query the DB. The 'language' comprises of algebraic expressions (or more specifically SQL like criteria) which I use to generate (ANSI) SQL statements which are sent to the db engine. The following lines are examples of what the language statements will look like, and hopefully, it will help further clarify the concept: **Example 1** DQL statement: foobar('yellow') between 1 and 3 and fredstats('weight') > 42 Translation: fetch all rows in an underlying table where computed values for aggregate function foobar() is between 1 and 3 AND computed value for AGG FUNC fredstats() is greater than 42 **Example 2** DQL statement: fredstats('weight') < barneystats('weight') AND foo('fighter') in (9,10,11) AND foobar('green') <> 42 Translation: Fetch all rows where the specified criteria matches **Example 3** DQL statement: foobar('green') / foobar('red') <> 42 Translation: Fetch all rows where the specified criteria matches **Example 4** DQL statement: foobar('green') - foobar('red') >= 42 Translation: Fetch all rows where the specified criteria matches Given the following information: The table upon which the queries above are being executed is called 'tbl' table 'tbl' has the following structure (id int, name varchar(32), weight float) The result set returns only the tbl.id, tbl.name and the names of the aggregate functions as columns in the result set - so for example the foobar() AGG FUNC column will be called foobar in the result set. So for example, the first DQL query will return a result set with the following columns: id, name, foobar, fredstats Given the above, my questions then are: What would be the underlying SQL required for Example1 ? What would be the underlying SQL required for Example3 ? Given an algebraic equation comprising of AGGREGATE functions, Is there a way of generalizing the algorithm needed to generate the required ANSI SQL statement(s)? I am using PostgreSQL as the db, but I would prefer to use ANSI SQL wherever possible.

    Read the article

  • PostgreSQL server not showing up in phpPgAdmin

    - by morpheous
    I have installed both phpPgAdmin and pgAdmin III on my Ubuntu 9.10 dev box. I created a server using pgAdmin III, and then added a database and populated it. I then navigated to phpPgAdmin, expecting to be able to see the server and be able to log on to the server using the postgres account. However, the only server shown 'PostgreSQL' (and it had a red cross icon). When I attempted to login using the postgres account, the following message was displayed: Login disallowed for security reasons. I have the following questions: Where is the server and database I created (they are still visible when I use pgAdmin III) How may I add another user to the server and give it access to a database?

    Read the article

  • Plotting 3-tuple data points in a surface / contour plot using matplotlib

    - by morpheous
    I have some surface data that is generated by an external program as XYZ values. I want to create the following graphs, using matplotlib: Surface plot Contour plot Contour plot overlayed with a surface plot I have looked at several examples for plotting surfaces and contours in matplotlib - however, the Z values seems to be a function of X and Y i.e. Y ~ f(X,Y). I assume that I will somehow need to transform my Y variables, but I have not seen any example yet, that shows how to do this. So, my question is this: given a set of (X,Y,Z) points, how may I generate Surface and contour plots from that data? BTW, just to clarify, I do NOT want to create scatter plots. Also although I mentioned matplotlib in the title, I am not averse to using rpy(2), if that will allow me to create these charts.

    Read the article

  • A good matplot tutorial (from beginner to intermidiate)?

    - by morpheous
    Can anyone recommend a good matplot tutorial. I am a complete beginner - but have used similar software (matlab, R etc), in my halcyon days at University (i.e. a long time ago). A google search brings up a list of dubious quality, and the 'official' docs are too terse, or provide examples that are more 'edge case' (e.g. drawing dolphins swimming in a bubble), than one is likely to meet in practise. I want a manual that provides the following information in a well structured manner: Introduction to the data types Introduction to 2D plotting with some simple practical examples (simple 2D graphs) Introduction to 3D plotting with some simple practical examples (simple 2D graphs: contour and surface)

    Read the article

  • Which SEO practises are likely to be responsible for SO questions appearing so quickly in Google sea

    - by morpheous
    Does anyone have some idea as to how come questions posted here on SO are showing up so quickly on Google?. Sometimes questions submitted are appearing as the first 10 entries or so - on the first page within 30 minutes of submitting a question. Pray tell, what sort of magic is being wielded here? Anybody have some ideas, suggestions?. My first thought is that they have info in their sitemap that tells google robots to trawl every N minutes or so - is that whats going on? BTW, I am aware that simply instructing Googlebots to scan your site every N minutes will not work if you dont have quality information (that is constantly being updated on your site). I'd just like to know if there is something else that SO may be doing right (apart from the marvelous content of course)

    Read the article

  • Is there the equivalent of cloud computing for modems?

    - by morpheous
    I asked this question on SF, and someone recommended that I ask it here - (I don't think I have enough points to move a question from SF to SO - and in any case, I don't know how to do it - so here is the question again): I am interested in the concept of PAAS (platform as a service). However, all talk about SAAS/PAAS seems to focus on only the computer itself - not its peripherals. Is it possible to 'outsource' modems as a resource - so that an app running remotely can pump data to a modem in the cloud? As a bit of background to the question, a group of us are thinking of starting a company that offers similar services to companies like twilio etc - but I want to 'outsource' both the computing hardware (thats PAAS - the easy bit) and the modems (thats what I cant seem to find any info on). Does anyone know if modems can be bundled as part of a PAAS service? - alternatively, is there a way that an application running on one computer can communicate (i.e. pump data) to a remote modem residing on another machine?. I assume I can come up with some protocol over UDP or TCP - but there is no point reinventing the wheel - if such a protocol like that already exists (or if it some open source software allows one to do this). Any suggestions on how to solve this problem?

    Read the article

  • How to write a custom solution using a python package, modules etc

    - by morpheous
    I am writing a packacge foobar which consists of the modules alice, bob, charles and david. From my understanding of Python packages and modules, this means I will create a folder foobar, with the following subdirectories and files (please correct if I am wrong) foobar/ __init__.py alice/alice.py bob/bob.py charles/charles.py david/david.py The package should be executable, so that in addition to making the modules alice, bob etc available as 'libraries', I should also be able to use foobar in a script like this: python foobar --args=someargs Question1: Can a package be made executable and used in a script like I described above? Question 2 The various modules will use code that I want to refactor into a common library. Does that mean creating a new sub directory 'foobar/common' and placing common.py in that folder? Question 3 How will the modules foo import the common module ? Is it 'from foobar import common' or can I not use this since these modules are part of the package? Question 4 I want to add logic for when the foobar package is being used in a script (assuming this can be done - I have only seen it done for modules) The code used is something like: if __name__ == "__main__": dosomething() where (in which file) would I put this logic ?

    Read the article

  • Debugging into a shared library source from consuming app, using QTCreator

    - by morpheous
    I am using QTCreator (1.3.1) on Ubuntu Karmic. I have built two projects: a shared library an application that links to the shared library I am debugging the application, and need to step into the implementation (i.e. the source) of one of the functions exported by the shared library. Does anyone know how to setup the QTCreator to allow me to step into the source of a shared library?

    Read the article

  • Using multiple aggregate functions in an algebraic expression in (ANSI) SQL statement

    - by morpheous
    I have the following aggregate functions (AGG FUNCs): foo(), foobar(), fredstats(), barneystats(). I want to know if I can use multiple AGG FUNCs in an algebraic expression. This may seem a strange/simplistic question for seasoned SQL developers - however, the but the reason I ask is that so far, all AGG FUNCs examples I have seen are of the simplistic variety e.g. max(salary) < 100, rather than using the AGG FUNCs in an expression which involves using multiple AGG FUNCs in an expression (like agg_func1() agg_func2()). The information below should help clarify further. Given tables with the following schemas: CREATE TABLE item (id int, length float, weight float); CREATE TABLE item_info (item_id, name varchar(32)); # Is it legal (ANSI) SQL to write queries of this format ? SELECT id, name, foo, foobar, fredstats FROM A, B (SELECT id, foo(123) as foo, foobar('red') as foobar, fredstats('weight') as fredstats FROM item GROUP BY id HAVING [ALGEBRAIC EXPRESSION] ORDER BY id AS A), item_info AS B WHERE item.id = B.id Where: ALGEBRAIC EXPRESSION is the type of expression that can be used in a WHERE clause - for example: ((foo(x) < foobar(y)) AND foobar(y) IN (1,2,3)) OR (fredstats(x) <> 0)) I am using PostgreSQL as the db, but I would prefer to use ANSI SQL wherever possible. Assuming it is legal to include AGG FUNCS in the way I have done above, I'd like to know: Is there a more efficient way to write the above query ? Is there any way I can speed up the query in terms of a judicious choice of indexes on the tables item and item_info ? Is there a performance hit of using AGG FUNCs in an algebraic expression like I am (i.e. an expression involving the output of aggregate functions rather than constants? Can the expression also include 'scaled' AGG FUNC? (for example: 2*foo(123) < -3*foobar(456) ) - will scaling (i.e. multiplying an AGG FUNC by a number have an effect on performance?) How can I write the query above using INNER JOINS instead?

    Read the article

  • Combining aggregate functions in an (ANSI) SQL statement

    - by morpheous
    I have aggregate functions foo(), foobar(), fredstats(), barneystats() I want to create a domain specific query language (DSQL) above my DB, to facilitate using using a domain language to query the DB. The 'language' comprises of boolean expressions (or more specifically SQL like criteria) which I then 'translate' back into pure (ANSI) SQL and send to the underlying Db. The following lines are examples of what the language statements will look like, and hopefully, it will help further clarify the concept: **Example 1** DQL statement: foobar('yellow') between 1 and 3 and fredstats('weight') > 42 Translation: fetch all rows in an underlying table where computed values for aggregate function foobar() is between 1 and 3 AND computed value for AGG FUNC fredstats() is greater than 42 **Example 2** DQL statement: fredstats('weight') < barneystats('weight') AND foo('fighter') in (9,10,11) AND foobar('green') <> 42 Translation: Fetch all rows where the specified criteria matches **Example 3** DQL statement: foobar('green') / foobar('red') <> 42 Translation: Fetch all rows where the specified criteria matches **Example 4** DQL statement: foobar('green') - foobar('red') >= 42 Translation: Fetch all rows where the specified criteria matches Given the following information: The table upon which the queries above are being executed is called 'tbl' table 'tbl' has the following structure (id int, name varchar(32), weight float) The result set returns only the tbl.id, tbl.name and the names of the aggregate functions as columns in the result set - so for example the foobar() AGG FUNC column will be called foobar in the result set. So for example, the first DQL query will return a result set with the following columns: id, name, foobar, fredstats Given the above, my questions then are: What would be the underlying SQL required for Example1 ? What would be the underlying SQL required for Example3 ? Given an algebraic equation comprising of AGGREGATE functions, Is there a way of generalizing the algorithm needed to generate the required ANSI SQL statement(s)? I am using PostgreSQL as the db, but I would prefer to use ANSI SQL wherever possible.

    Read the article

  • creating PHP C/C++ extension modules using SWIG

    - by morpheous
    I have written some C/C++ extension modules for PHP, using the 'old fashioned way' - i.e. by using the manual way (as described by Sarah Golemon in her book). This is too fiddly for me, and since I am lazy, and would like to automate as much as possible. Also, I have used SWIG now to generate extensions to Python, and I am getting to like using it quite a lot. I am thinking of using SWIG to generate my future PHP extensions. I am using PHP v5.2 (and above) on my production servers. My questions are: Is SWIG PHP interface stable yet (i.e. ready for production)? If you answered yes to question 1 -are YOU using it in YOUR production site? Are there any 'gotchas' I need to be aware of when creating PHP extension ,modules using SWIG?

    Read the article

  • How to calculate change in ANSI SQL

    - by morpheous
    I have a table that contains sales data. The data is stored in a table that looks like this: CREATE table sales_data ( sales_time timestamp , sales_amt double ) I need to write parameterized queries that will allow me to do the following: Return the change in sales_amt between times t2 and t1, where t2 and t1 are separated by a time interval (integer) of N. This query will allow for querying for weekly changes in sales (for example). Return the change in change of sales_amt between times t2 and t1, and time t3 and t4. Thats is to calculate the value (val(t2)-val(t1)) - (val(t4)-val(t3)). where t2 and t1 are separated by the same time interval (interval N) as the interval between t4 and t3. This query will allow for querying for changes in weekly changes in sales (for example).

    Read the article

  • Web page database query optimization

    - by morpheous
    I am putting together a web page which is quite 'expensive' in terms of database hits. I don't want to start optimizing at this stage - though with me trying to hit a deadline, I may end up not optimizing at all. Currently the page requires 18 (that's right eighteen) hits to the db. I am already using joins, and some of the queries are UNIONed to minimize the trips to the db. My local dev machine can handle this (page is not slow) however, I feel if I release this into the wild, the number of queries will quickly overwhelm my database (MySQL). I could always use memcache or something similar, but I would much rather continue with my other dev work that needs to be completed before the deadline - at least retrieving the page works - its simply a matter of optimization now (if required). My question therefore is - is 18 db queries for a single page retrieval completely outrageous - (i.e. I should put everything on hold and optimize the hell of the retrieval logic), or shall I continue as normal, meet the deadline and release on schedule and see what happens? [Edit] Just to clarify, I have already done the 'obvious' things like using (single and composite) indexes for fields used in the queries. What I haven't yet done is to run a query analyzer to see if my indexes etc are optimal.

    Read the article

< Previous Page | 1 2 3 4 5  | Next Page >