Search Results

Search found 14260 results on 571 pages for 'regex group'.

Page 124/571 | < Previous Page | 120 121 122 123 124 125 126 127 128 129 130 131 | Next Page >

Get percentiles of data-set with group by month

- by Cylindric

Hello, I have a SQL table with a whole load of records that look like this: | Date | Score | + -----------+-------+ | 01/01/2010 | 4 | | 02/01/2010 | 6 | | 03/01/2010 | 10 | ... | 16/03/2010 | 2 | I'm plotting this on a chart, so I get a nice line across the graph indicating score-over-time. Lovely. Now, what I need to do is include the average score on the chart, so we can see how that changes over time, so I can simply add this to the mix: SELECT YEAR(SCOREDATE) 'Year', MONTH(SCOREDATE) 'Month', MIN(SCORE) MinScore, AVG(SCORE) AverageScore, MAX(SCORE) MaxScore FROM SCORES GROUP BY YEAR(SCOREDATE), MONTH(SCOREDATE) ORDER BY YEAR(SCOREDATE), MONTH(SCOREDATE) That's no problem so far. The problem is, how can I easily calculate the percentiles at each time-period? I'm not sure that's the correct phrase. What I need in total is: A line on the chart for the score (easy) A line on the chart for the average (easy) A line on the chart showing the band that 95% of the scores occupy (stumped) It's the third one that I don't get. I need to calculate the 5% percentile figures, which I can do singly: SELECT MAX(SubQ.SCORE) FROM (SELECT TOP 45 PERCENT SCORE FROM SCORES WHERE YEAR(SCOREDATE) = 2010 AND MONTH(SCOREDATE) = 1 ORDER BY SCORE ASC) AS SubQ SELECT MIN(SubQ.SCORE) FROM (SELECT TOP 45 PERCENT SCORE FROM SCORES WHERE YEAR(SCOREDATE) = 2010 AND MONTH(SCOREDATE) = 1 ORDER BY SCORE DESC) AS SubQ But I can't work out how to get a table of all the months. | Date | Average | 45% | 55% | + -----------+---------+-----+-----+ | 01/01/2010 | 13 | 11 | 15 | | 02/01/2010 | 10 | 8 | 12 | | 03/01/2010 | 5 | 4 | 10 | ... | 16/03/2010 | 7 | 7 | 9 | At the moment I'm going to have to load this lot up into my app, and calculate the figures myself. Or run a larger number of individual queries and collate the results.

Read the article
Backreferences syntax in replacement strings (why dollar sign?)

- by polygenelubricants

In Java, and it seems in a few other languages, backreferences in the pattern is preceded by a slash (e.g. \1, \2, \3, etc), but in a replacement string it's preceded by a dollar sign (e.g. $1, $2, $3, and also $0). Here's a snippet to illustrate: System.out.println( "left-right".replaceAll("(.*)-(.*)", "\\2-\\1") // WRONG!!! ); // prints "2-1" System.out.println( "left-right".replaceAll("(.*)-(.*)", "$2-$1") // CORRECT! ); // prints "right-left" System.out.println( "You want million dollar?!?".replaceAll("(\\w*) dollar", "US\\$ $1") ); // prints "You want US$ million?!?" System.out.println( "You want million dollar?!?".replaceAll("(\\w*) dollar", "US$ \\1") ); // throws IllegalArgumentException: Illegal group reference Questions: Is the use of $ for backreferences in replacement strings unique to Java? If not, what language started it? What flavors use it and what don't? Why is this a good idea? Why not stick to the same pattern syntax? Wouldn't that lead to a more cohesive and an easier to learn language? Wouldn't the syntax be more streamlined if statements 1 and 4 in the above were the "correct" ones instead of 2 and 3?

Read the article
SQL group results as a column array

- by Radek

Hi guys, this is an SQL question and don't know which type of JOIN, GROUP BY etc. to use, it is for a chat program where messages are related to rooms and each day in a room is linked to a transcript etc. Basically, when outputting my transcripts, I need to show which users have chatted on that transcript. At the moment I link them through the messages like so: SELECT rooms.id, rooms.name, niceDate, room_transcripts.date, long FROM room_transcripts JOIN rooms ON room_transcripts.room=rooms.id JOIN transcript_users ON transcript_users.room=rooms.id AND transcript_users.date=room_transcripts.date JOIN users ON transcript_users.user=users.id WHERE room_transcripts.deleted=0 AND rooms.id IN (1,2) ORDER BY room_transcripts.id DESC, long ASC The result set looks like this: Array ( [0] => Array ( [id] => 2 [name] => Room 2 [niceDate] => Wednesday, April 14 [date] => 2010-04-14 [long] => Jerry Seinfeld ) [1] => Array ( [id] => 1 [name] => Room 1 [niceDate] => Wednesday, April 14 [date] => 2010-04-14 [long] => Jerry Seinfeld ) [2] => Array ( [id] => 1 [name] => Room 1 [niceDate] => Wednesday, April 14 [date] => 2010-04-14 [long] => Test Users ) ) I would like though for each element in the array to represent one transcript entry and for the users to be grouped in an array as the entry's element. So 'long' will be an array listing all the names. Can this be done? At the moment I just append the names and when the transcript date and room changes I echo them retrospectively, but I will do the same for files and highlighted messages and it's messy. Thanks.

Read the article
pattern for the following condition in java

- by zahir hussain

hi i want to know how to write pattern.. for example : the word is "AboutGoogle AdWords Drive traffic and customers to your site. Pay through Cheque, Net Banking or Credit Card. Google Toolbar Add a search box to your browser. Google SMS To find out local information simply SMS to 54664. Gmail Free email with 7.2GB storage and less spam. Try Gmail today. Our ProductsHelp Help with Google Search, Services and ProductsGoogle Web Search Features Translation, I'm Feeling Lucky, CachedGoogle Services & Tools Toolbar, Google Web APIs, ButtonsGoogle Labs Ideas, Demos, ExperimentsFor Site OwnersAdvertising AdWords, AdSenseBusiness Solutions Google Search Appliance, Google Mini, WebSearchWebmaster Central One-stop shop for comprehensive info about how Google crawls and indexes websitesSubmit your content to Google Add your site, Google SitemapsOur CompanyPress Center News, Images, ZeitgeistJobs at Google Openings, Perks, CultureCorporate Info Company overview, Philosophy, Diversity, AddressesInvestor Relations Financial info, Corporate governanceMore GoogleContact Us FAQs, Feedback, NewsletterGoogle Logos Official Logos, Holiday Logos, Fan LogosGoogle Blog Insights to Google products and cultureGoogle Store Pens, Shirts, Lava lamps©2010 Google - Privacy Policy - Terms of Service" I have to search some word... for example "google insights" so how to write the code in java... i just write small code... check my code and answer my question... that code only use for find the search word, where is that. but i need to display some words front of search word and display some words rear of search workd... similar to google search... my code is Pattern p = Pattern.compile("(?i)(.*?)"+search+""); Matcher m = p.matcher(full); String title=""; while (m.find() == true) { title=m.group(1); System.out.println(title); } the full is orignal content, search s search word... thanks and advance

Read the article
Building Stored Procedure to group data into ranges with roughly equal results in each bucket

- by Len

I am trying to build one procedure to take a large amount of data and create 5 range buckets to display the data. the buckets ranges will have to be set according to the results. Here is my existing SP GO /****** Object: StoredProcedure [dbo].[sp_GetRangeCounts] Script Date: 03/28/2010 19:50:45 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO ALTER PROCEDURE [dbo].[sp_GetRangeCounts] @idMenu int AS declare @myMin decimal(19,2), @myMax decimal(19,2), @myDif decimal(19,2), @range1 decimal(19,2), @range2 decimal(19,2), @range3 decimal(19,2), @range4 decimal(19,2), @range5 decimal(19,2), @range6 decimal(19,2) SELECT @myMin=Min(modelpropvalue), @myMax=Max(modelpropvalue) FROM xmodelpropertyvalues where modelPropUnitDescriptionID=@idMenu set @myDif=(@myMax-@myMin)/5 set @range1=@myMin set @range2=@myMin+@myDif set @range3=@range2+@myDif set @range4=@range3+@myDif set @range5=@range4+@myDif set @range6=@range5+@myDif select @myMin,@myMax,@myDif,@range1,@range2,@range3,@range4,@range5,@range6 select t.range as myRange, count(*) as myCount from ( select case when modelpropvalue between @range1 and @range2 then 'range1' when modelpropvalue between @range2 and @range3 then 'range2' when modelpropvalue between @range3 and @range4 then 'range3' when modelpropvalue between @range4 and @range5 then 'range4' when modelpropvalue between @range5 and @range6 then 'range5' end as range from xmodelpropertyvalues where modelpropunitDescriptionID=@idmenu) t group by t.range order by t.range This calculates the min and max value from my table, works out the difference between the two and creates 5 buckets. The problem is that if there are a small amount of very high (or very low) values then the buckets will appear very distorted - as in these results... range1 2806 range2 296 range3 75 range5 1 Basically I want to rebuild the SP so it creates buckets with equal amounts of results in each. I have played around with some of the following approaches without quite nailing it... SELECT modelpropvalue, NTILE(5) OVER (ORDER BY modelpropvalue) FROM xmodelpropertyvalues - this creates a new column with either 1,2,3,4 or 5 in it ROW_NUMBER()OVER (ORDER BY modelpropvalue) between @range1 and @range2 ROW_NUMBER()OVER (ORDER BY modelpropvalue) between @range2 and @range3 or maybe i could allocate every record a row number then divide into ranges from this?

Read the article
Match subpatterns in any order

- by Yaroslav

I have long regexp with two complicated subpatters inside. How i can match that subpatterns in any order? Simplified example: /(apple)?\s?(banana)?\s?(orange)?\s?(kiwi)?/ and i want to match both of apple banana orange kiwi apple orange banana kiwi It is very simplified example. In my case banana and orange is long complicated subpatterns and i don't want to do something like /(apple)?\s?((banana)?\s?(orange)?|(orange)?\s?(banana)?)\s?(kiwi)?/ Is it possible to group subpatterns like chars in character class? UPD Real data as requested: 14:24 26,37 Mb 108.53 01:19:02 06.07 24.39 19:39 46:00 my strings much longer, but it is significant part. Here you can see two lines what i need to match. First has two values: length (14 min 24 sec) and size 26.37 Mb. Second one has three values but in different order: size 108.53 Mb, length 01 h 19 m 02 s and date June, 07 Third one has two size and length Fourth has only length There are couple more variations and i need to parse all values. I have a regexp that pretty close except i can't figure out how to match patterns in different order without writing it twice. (?<size>\d{1,3}\[.,]\d{1,2}\s+(?:Mb)?)?\s? (?<length>(?:(?:01:)?\d{1,2}:\d{2}))?\s* (?<date>\d{2}\.\d{2}))? NOTE: that is only part of big regexp that forks fine already.

Read the article
Group panel of WPF ListBox

- by rulestein

I have a listbox that is grouping the items with a GroupStyle. I would like add a control at the bottom of the stackpanel that holds all of the groups. This additional control needs to be part of the scrolling content so that the user would scroll to the bottom of the list to see the control. If I were using a listbox without the groups, this task would be easy by modifying the ListBox template. However, with the items grouped, the ListBox template seems to only apply on a per group basis. I can modify the GroupStyle.Panel, but that doesn't allow me to add items to that panel. <ListBox> <ListBox.ItemsPanel> <ItemsPanelTemplate> <WrapPanel/> </ItemsPanelTemplate> </ListBox.ItemsPanel> <ListBox.GroupStyle> <GroupStyle> <GroupStyle.Panel> <ItemsPanelTemplate> <VirtualizingStackPanel/> **<----- I would like to add a control to this stackpanel** </ItemsPanelTemplate> </GroupStyle.Panel> <GroupStyle.ContainerStyle> <Style TargetType="{x:Type GroupItem}"> <Setter Property="Template"> <Setter.Value> <ControlTemplate TargetType="{x:Type GroupItem}"> <Grid> <ItemsPresenter /> </Grid> </ControlTemplate> </Setter.Value> </Setter> </Style> </GroupStyle.ContainerStyle> </GroupStyle> </ListBox.GroupStyle> This should give an idea of what I need to do:

Read the article
Eliminate full table scan due to BETWEEN (and GROUP BY)

- by Dave Jarvis

Description According to the explain command, there is a range that is causing a query to perform a full table scan (160k rows). How do I keep the range condition and reduce the scanning? I expect the culprit to be: Y.YEAR BETWEEN 1900 AND 2009 AND Code Here is the code that has the range condition (the STATION_DISTRICT is likely superfluous). SELECT COUNT(1) as MEASUREMENTS, AVG(D.AMOUNT) as AMOUNT, Y.YEAR as YEAR, MAKEDATE(Y.YEAR,1) as AMOUNT_DATE FROM CITY C, STATION S, STATION_DISTRICT SD, YEAR_REF Y FORCE INDEX(YEAR_IDX), MONTH_REF M, DAILY D WHERE -- For a specific city ... -- C.ID = 10663 AND -- Find all the stations within a specific unit radius ... -- 6371.009 * SQRT( POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) + (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) * POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= 50 AND -- Get the station district identification for the matching station. -- S.STATION_DISTRICT_ID = SD.ID AND -- Gather all known years for that station ... -- Y.STATION_DISTRICT_ID = SD.ID AND -- The data before 1900 is shaky; insufficient after 2009. -- Y.YEAR BETWEEN 1900 AND 2009 AND -- Filtered by all known months ... -- M.YEAR_REF_ID = Y.ID AND -- Whittled down by category ... -- M.CATEGORY_ID = '003' AND -- Into the valid daily climate data. -- M.ID = D.MONTH_REF_ID AND D.DAILY_FLAG_ID <> 'M' GROUP BY Y.YEAR Update The SQL is performing a full table scan, which results in MySQL performing a "copy to tmp table", as shown here: +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ | 1 | SIMPLE | C | const | PRIMARY | PRIMARY | 4 | const | 1 | | | 1 | SIMPLE | Y | range | YEAR_IDX | YEAR_IDX | 4 | NULL | 160422 | Using where | | 1 | SIMPLE | SD | eq_ref | PRIMARY | PRIMARY | 4 | climate.Y.STATION_DISTRICT_ID | 1 | Using index | | 1 | SIMPLE | S | eq_ref | PRIMARY | PRIMARY | 4 | climate.SD.ID | 1 | Using where | | 1 | SIMPLE | M | ref | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX | 8 | climate.Y.ID | 54 | Using where | | 1 | SIMPLE | D | ref | INDEX | INDEX | 8 | climate.M.ID | 11 | Using where | +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ Related http://dev.mysql.com/doc/refman/5.0/en/how-to-avoid-table-scan.html http://dev.mysql.com/doc/refman/5.0/en/where-optimizations.html http://stackoverflow.com/questions/557425/optimize-sql-that-uses-between-clause Thank you!

Read the article
Html.RadioButton group defaulting to 0

- by awrigley

Hi A default value of 0 is creeping into a Surveys app I am developing, for no reason that I can see. The problem is as follows: I have a group of Html.RadioButtons that represent the possible values a user can choose to answer a survey question (1 == Not at all, 2 == A little, 3 == A lot). I have used a tinyint datatype, that does not allow null, to store the answer to each question. The view code looks like this: <ol class="SurveyQuestions"> <% foreach (SurveyQuestion question in Model.Questions) { string col = question.QuestionColumn; %> <li><%=question.QuestionText%> <ul style="float:right;" class="MultiChoice"> <li><%= Html.RadioButton(col, "1")%></li> <li><%= Html.RadioButton(col, "2")%></li> <li><%= Html.RadioButton(col, "3")%></li> </ul> <%= Html.ValidationMessage(col, "*") %> </li> <% } %> </ol> [Note on the above code:] Each survey has about 70 questions, so I have put the questions text in one table, and store the results in a different table. I have put the Questions into my form view model (hence Model.Questions); the questions table has a field called QuestionColumn that allows me to link up the answer table column to the question, as shown above (<%= Html.RadioButton(col, "1")%, etc) [/Note] However, when the user DOESN'T answer the question, the value 0 is getting inserted into the database column. As a result, I don't get what I expect, ie, a validation error. In no place have I stipulated a default value of 0 for the fields in the answers table. So what is happening? Any ideas?

Read the article
Can't store UTF-8 in RDS despite setting up new Parameter Group using Rails on Heroku

- by Lail

I'm setting up a new instance of a Rails(2.3.5) app on Heroku using Amazon RDS as the database. I'd like to use UTF-8 for everything. Since RDS isn't UTF-8 by default, I set up a new Parameter Group and switched the database to use that one, basically per this. Seems to have worked: SHOW VARIABLES LIKE '%character%'; character_set_client utf8 character_set_connection utf8 character_set_database utf8 character_set_filesystem binary character_set_results utf8 character_set_server utf8 character_set_system utf8 character_sets_dir /rdsdbbin/mysql-5.1.50.R3/share/mysql/charsets/ Furthermore, I've successfully setup Heroku to use the RDS database. After rake db:migrate, everything looks good: CREATE TABLE `comments` ( `id` int(11) NOT NULL AUTO_INCREMENT, `commentable_id` int(11) DEFAULT NULL, `parent_id` int(11) DEFAULT NULL, `content` text COLLATE utf8_unicode_ci, `child_count` int(11) DEFAULT '0', `created_at` datetime DEFAULT NULL, `updated_at` datetime DEFAULT NULL, PRIMARY KEY (`id`), KEY `commentable_id` (`commentable_id`), KEY `index_comments_on_community_id` (`community_id`), KEY `parent_id` (`parent_id`) ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; In the markup, I've included: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> Also, I've set: production: encoding: utf8 collation: utf8_general_ci ...in the database.yml, though I'm not very confident that anything is being done to honor any of those settings in this case, as Heroku seems to be doing its own config when connecting to RDS. Now, I enter a comment through the form in the app: "Úbe® ƒåiL", but in the database I've got "ÃšbeÂ® Æ’Ã¥iL" It looks fine when Rails loads it back out of the database and it is rendered to the page, so whatever it is doing one way, it's undoing the other way. If I look at the RDS database in Sequel Pro, it looks fine if I set the encoding to "UTF-8 Unicode via Latin 1". So it seems Latin-1 is sneaking in there somewhere. Somebody must have done this before, right? What am I missing?

Read the article
complex regular expression task

- by Don Don

Hi, What regular expressions do I need to extract section title(s) in a text file? So, in the following sample text, I'd like to extract "Communication and Leadership" "1.Self-Knowledge" "2. Humility" "(3) Clear Thinking". Many thanks. Communication and Leadership True leaders understand that, rather than forcing their followers into a preconceived mold, their job is to motivate and organize followers to collectively accomplish goals that are in everyone's interests. The ability to communicate this to co-workers and followers is critical to the effectiveness of leadership. 1.Self-Knowledge Superior leaders are able to devote their skills and energies to leadership of a group because they have worked through personal issues to the point where they know themselves thoroughly. A high level of self-knowledge is a prerequisite to effective communication skills, because the things that you communicate as a leader are coming from within. 2. Humility This subversion of personal preference requires a certain level of humility. Although popular definitions of leaders do not always see them as humble, the most effective leaders actually are. This humility may not be expressed in self-effacement, but in a total commitment to the goals of the organization. Humility requires an understanding of one's own relative unimportance in comparison to larger systems. (3) Clear Thinking Clarity of thinking translates into clarity of communication. A leader whose goals or personal analysis is muddled will tend to deliver unclear or ambiguous directions to followers, leading to confusion and dissatisfaction. A leader with a clear mind who is not ambivalent about her purposes will communicate what needs to be done in a s traightforward and unmistakable manner.

Read the article
Need a set based solution to group rows

- by KM

I need to group a set of rows based on the Category column, and also limit the combined rows based on the SUM(Number) column to be less than or equal to the @Limit value. For each distinct Category column I need to identify "buckets" that are <=@limit. If the SUM(Number) of all the rows for a Category column are <=@Limit then there will be only 1 bucket for that Category value (like 'CCCC' in the sample data). However if the SUM(Number)@limit, then there will be multiple bucket rows for that Category value (like 'AAAA' in the sample data), and each bucket must be <=@Limit. There can be as many buckets as necessary. Also, look at Category value 'DDDD', its one row is greater than @Limit all by itself, and gets split into two rows in the result set. Given this simplified data: DECLARE @Detail table (DetailID int primary key, Category char(4), Number int) SET NOCOUNT ON INSERT @Detail VALUES ( 1, 'AAAA',100) INSERT @Detail VALUES ( 2, 'AAAA', 50) INSERT @Detail VALUES ( 3, 'AAAA',300) INSERT @Detail VALUES ( 4, 'AAAA',200) INSERT @Detail VALUES ( 5, 'BBBB',500) INSERT @Detail VALUES ( 6, 'CCCC',200) INSERT @Detail VALUES ( 7, 'CCCC',100) INSERT @Detail VALUES ( 8, 'CCCC', 50) INSERT @Detail VALUES ( 9, 'DDDD',800) INSERT @Detail VALUES (10, 'EEEE',100) SET NOCOUNT OFF DECLARE @Limit int SET @Limit=500 I need one of these result set: DetailID Bucket | DetailID Category Bucket -------- ------ | -------- -------- ------ 1 1 | 1 'AAAA' 1 2 1 | 2 'AAAA' 1 3 1 | 3 'AAAA' 1 4 2 | 4 'AAAA' 2 5 3 OR 5 'BBBB' 1 6 4 | 6 'CCCC' 1 7 4 | 7 'CCCC' 1 8 4 | 8 'CCCC' 1 9 5 | 9 'DDDD' 1 9 6 | 9 'DDDD' 2 10 7 | 10 'EEEE' 1

Read the article
Speeding up a group by date query on a big table in postgres

- by zaius

I've got a table with around 20 million rows. For arguments sake, lets say there are two columns in the table - an id and a timestamp. I'm trying to get a count of the number of items per day. Here's what I have at the moment. SELECT DATE(timestamp) AS day, COUNT(*) FROM actions WHERE DATE(timestamp) >= '20100101' AND DATE(timestamp) < '20110101' GROUP BY day; Without any indices, this takes about a 30s to run on my machine. Here's the explain analyze output: GroupAggregate (cost=675462.78..676813.42 rows=46532 width=8) (actual time=24467.404..32417.643 rows=346 loops=1) -> Sort (cost=675462.78..675680.34 rows=87021 width=8) (actual time=24466.730..29071.438 rows=17321121 loops=1) Sort Key: (date("timestamp")) Sort Method: external merge Disk: 372496kB -> Seq Scan on actions (cost=0.00..667133.11 rows=87021 width=8) (actual time=1.981..12368.186 rows=17321121 loops=1) Filter: ((date("timestamp") >= '2010-01-01'::date) AND (date("timestamp") < '2011-01-01'::date)) Total runtime: 32447.762 ms Since I'm seeing a sequential scan, I tried to index on the date aggregate CREATE INDEX ON actions (DATE(timestamp)); Which cuts the speed by about 50%. HashAggregate (cost=796710.64..796716.19 rows=370 width=8) (actual time=17038.503..17038.590 rows=346 loops=1) -> Seq Scan on actions (cost=0.00..710202.27 rows=17301674 width=8) (actual time=1.745..12080.877 rows=17321121 loops=1) Filter: ((date("timestamp") >= '2010-01-01'::date) AND (date("timestamp") < '2011-01-01'::date)) Total runtime: 17038.663 ms I'm new to this whole query-optimization business, and I have no idea what to do next. Any clues how I could get this query running faster?

Read the article
Parsing HTML using HtmlParser

- by Blankman

My html has 20 or so rows of the following HTML pattern. So the below is considered a single instance of the pattern. Each instance of this pattern represents a product. Again the below is a single instance, it spans multiple rows in the HTML table. <table> ..  <tr> <td rowspan="5" class="product" valign="top"><nobr> ????????????</td> </tr> <tr> <td class="title" ??????????>?????????</td> <td class="title" ??????????>?????????</td> <td class="title" ??????????>?????????</td> <td class="title" ??????????>?????????</td> <td class="title" ??????????>?????????</td> <td class="title" ??????????>?????????</td> </tr> <tr> <td class="data" ?????? </td> <td class="data" ?????? </td> <td class="data" ?????? </td> <td class="data" ?????? </td> <td class="data" ?????? </td> <td class="data" ?????? </td> </tr> </tr> <tr> <td colspan="5" ????????</td> </tr> <tr> <td colspan="6" width="100%"> <hr></td> </tr>   .. <table> I am trying to use HtmlParser for this. Parser rowParser = new Parser(); rowParser.setInputHtml(page.getHtml()); // page object represents a html page rowParser.setEncoding("UTF-8"); NodeFilter productRowFilter = new AndFilter( new TagNameFilter("tr"), new HasChildFilter( new AndFilter( new TagNameFilter("td"), new HasAttributeFilter("class", "product"))) The above filter doesn't work, just showing you what I have so far. I need to somehow combine these filters, and use the last td to mark the end of the pattern i.e. the td with the colspan=6 and width=100% with child element hr. I have been struggling with this, and have resorted to Regex'ing but was told numerous times to NOT use regex for html parsing, so here I am! Your help is much appreciated!

Read the article
php/dos : How do you parse a regedit export file?

- by phill

My objective is to look for Company key-value in the registry hive and then pull the corresponding Guid and other keys and values following it. So I figured i would run the regedit export command and then parse the file with php for the keys I need. So after running the dos batch command >regedit /E "output.txt" "HKLM\System....\Company1" The output textfile seems to be in some kind of UNICODE format which isn't regex friendly. I'm using php to parse the file and pull the keys. Here is the php code i'm using to parse the file <?php $regfile = "output.txt"; $handle = fopen ("c:\\\\" . $regfile,"r"); //echo "handle: " . $file . "<br>"; $row = 1; while ((($data = fgets($handle, 1024)) !== FALSE) ) { $num = count($data); echo "$num fields in line $row: \n"; $reg_section = $data; //$reg_section = "[HKEY_LOCAL_MACHINE\SOFTWARE\TECHNOLOGIES\MEDIUS\CONFIG MANAGER\SYSTEM\COMPANIES\RECORD11]"; $pattern = "/^(\[HKEY_LOCAL_MACHINE\\\SOFTWARE\\\TECHNOLOGIES\\\MEDIUS\\\CONFIG MANAGER\\\SYSTEM\\\COMPANIES\\\RECORD(\d+)\])$/"; if ( preg_match($pattern, $reg_section )) { echo "<font color=red>Found</font><br>"; } else { echo "not found<br>"; echo $data . "<br>"; } $row++; } //end while fclose($handle); ?> and the output looks like this.... 1 fields in line 1: not found ÿþW?i?n?d?o?w?s? ?R?e?g?i?s?t?r?y? ?E?d?i?t?o?r? ?V?e?r?s?i?o?n? ?5?.?0?0? ? 1 fields in line 2: not found 1 fields in line 3: not found [?H?K?E?Y??L?O?C?A?L??M?A?C?H?I?N?E?\?S?O?F?T?W?A?R?E?\?I?N?T?E?R?S?T?A?R? ?T?E?C?H?N?O?L?O?G?I?E?S?\?X?M?E?D?I?U?S?\?C?O?N?F?I?G? ?M?A?N?A?G?E?R?\?S?Y?S?T?E?M?\?C?O?M?P?A?N?I?E?S?]? ? 1 fields in line 4: not found "?N?e?x?t? ?R?e?c?o?r?d? ?I?D?"?=?"?4?1?"? ? 1 fields in line 5: not found Any ideas how to approach this? thanks in advance

Read the article
How do I create something like a negated character class with a string instead of characters?

- by Chas. Owens

I am trying to write a tokenizer for Mustache in Perl. I can easily handle most of the tokens like this: #!/usr/bin/perl use strict; use warnings; my $comment = qr/ \G \{\{ ! (?<comment> .+? ) }} /xs; my $variable = qr/ \G \{\{ (?<variable> .+? ) }} /xs; my $text = qr/ \G (?<text> .+? ) (?= \{\{ | \z ) /xs; my $tokens = qr/ $comment | $variable | $text /x; my $s = do { local $/; <DATA> }; while ($s =~ /$tokens/g) { my ($type) = keys %+; (my $contents = $+{$type}) =~ s/\n/\\n/; print "type [$type] contents [$contents]\n"; } __DATA__ {{!this is a comment}} Hi {{name}}, I like {{thing}}. But I am running into trouble with the Set Delimiters directive: #!/usr/bin/perl use strict; use warnings; my $delimiters = qr/ \G \{\{ (?<start> .+? ) = [ ] = (?<end> .+?) }} /xs; my $comment = qr/ \G \{\{ ! (?<comment> .+? ) }} /xs; my $variable = qr/ \G \{\{ (?<variable> .+? ) }} /xs; my $text = qr/ \G (?<text> .+? ) (?= \{\{ | \z ) /xs; my $tokens = qr/ $comment | $delimiters | $variable | $text /x; my $s = do { local $/; <DATA> }; while ($s =~ /$tokens/g) { for my $type (keys %+) { (my $contents = $+{$type}) =~ s/\n/\\n/; print "type [$type] contents [$contents]\n"; } } __DATA__ {{!this is a comment}} Hi {{name}}, I like {{thing}}. {{(= =)}} If I change it to my $delimiters = qr/ \G \{\{ (?<start> [^{]+? ) = [ ] = (?<end> .+?) }} /xs; It works fine, but the point of the Set Delimiters directive is to change the delimiters, so the code will wind up looking like my $variable = qr/ \G $start (?<variable> .+? ) $end /xs; And it is perfectly valid to say {{{== ==}}} (i.e. change the delimiters to {= and =}). What I want, but maybe not what I need, is the ability to say something like (?:not starting string)+?. I figure I am just going to have to give up being clean about it and drop code into the regex to force it to match only what I want. I am trying to avoid that for four reasons: I don't think it is very clean. It is marked as experimental. I am not very familier with it (I think it comes down to (?{CODE}) and returning special values. I am hoping someone knows some other exotic feature that I am not familiar with that fits the situation better (e.g. (?(condition)yes-pattern|no-pattern)). Just to make things clear (I hope), I am trying to match a constant length starting delimiter followed by the shortest string that allows a match and does not contain the starting delimiter followed by a space followed by an equals sign followed by the shortest string that allows a match that ends with the ending delimiter.

Read the article
SQL query mixing aggregated results and single values

- by Paul Flowerdew

I have a table with transactions. Each transaction has a transaction ID, and accounting period (AP), and a posting value (PV), as well as other fields. Some of the IDs are duplicated, usually because the transaction was done in error. To give an example, part of the table might look like: ID PV AP 123 100 2 123 -100 5 In this case the transaction was added in AP2 then removed in AP5. Another example would be: ID PV AP 456 100 2 456 -100 5 456 100 8 In the first example, the problem is that if I am analyzing what was spent in AP2, there is a transaction in there which actually shouldn't be taken into account because it was taken out again in AP5. In the second example, the second two transactions shouldn't be taken into account because they cancel each other out. I want to label as many transactions as possible which shouldn't be taken into account as erroneous. To identify these transactions, I want to find the ones with duplicate IDs whose PVs sum to zero (like ID 123 above) or transactions where the PV of the earliest one is equal to sum(PV), as in the second example. This second condition is what is causing me grief. So far I have SELECT * FROM table WHERE table.ID IN (SELECT table.ID FROM table GROUP BY table.ID HAVING COUNT(*) > 1 AND (SUM(table.PV) = 0 OR SUM(table.PV) = <PV of first transaction in each group>)) ORDER BY table.ID; The bit in chevrons is what I'm trying to do and I'm stuck. Can I do it like this or is there some other method I can use in SQL to do this? Edit 1: Btw I forgot to say that I'm using SQL Compact 3.5, in case it matters. Edit 2: I think the code snippet above is a bit misleading. I still want to mark out transactions with duplicate IDs where sum(PV) = 0, as in the first example. But where the PV of the earliest transaction = sum(PV), as in the second example, what I actually want is to keep the earliest transaction and mark out all the others with the same ID. Sorry if that caused confusion. Edit 3: I've been playing with Clodoaldo's solution and have made some progress, but still can't get quite what I want. I'm trying to get the transactions I know for certain to be erroneous. Suppose the following transactions are also in the table: ID PV AP 789 100 2 789 200 5 789 -100 8 In this example sum(PV) < 0 and the earliest PV < sum(PV) so I don't want to mark any of these out. If I modify Clodoaldo's query as follows: select t.* from t left join ( select id, min(ap) as ap, sum(pv) as sum_pv from t group by id having sum(pv) <> 0 ) s on t.id = s.id and t.ap = s.ap and t.pv = s.sum_pv where s.id is null This gives the result ID PV AP 123 100 2 123 -100 5 456 -100 5 456 100 8 789 100 3 789 200 5 789 -100 8 Whilst the first 4 transactions are ok (they would be marked out), the 789 transactions are also there, and I don't want them. But I can't figure out how to modify the query so that they're not included. Any ideas?

Read the article
LINQ to XML : A query body must end with a select clause or a group clause

- by Josh

Can someone guide me on to repairing the error on this query : var objApps = from item in xDoc.Descendants("VHost") where(from x in item.Descendants("Application")) select new clsApplication { ConnectionsTotal = item.Element("ConnectionsTotal").Value }; It displays a compiler error "A query body must end with a select clause or a group clause". Where am I going wrong? Would appreciate any help.. Thanks. Edit : Here is my XML(haven't closed the tags here)...I need the connectioncount values inside the Application.. - <Server> <ConnectionsCurrent>67</ConnectionsCurrent> <ConnectionsTotal>1424182</ConnectionsTotal> <ConnectionsTotalAccepted>1385091</ConnectionsTotalAccepted> <ConnectionsTotalRejected>39091</ConnectionsTotalRejected> <MessagesInBytesRate>410455.0</MessagesInBytesRate> <MessagesOutBytesRate>540146.0</MessagesOutBytesRate> - <VHost> <Name>_defaultVHost_</Name> <TimeRunning>5129615.178</TimeRunning> <ConnectionsLimit>0</ConnectionsLimit> <ConnectionsCurrent>67</ConnectionsCurrent> <ConnectionsTotal>1424182</ConnectionsTotal> <ConnectionsTotalAccepted>1385091</ConnectionsTotalAccepted> <ConnectionsTotalRejected>39091</ConnectionsTotalRejected> <MessagesInBytesRate>410455.0</MessagesInBytesRate> <MessagesOutBytesRate>540146.0</MessagesOutBytesRate> - <Application> <Name>TestApp</Name> <Status>loaded</Status> <TimeRunning>411642.953</TimeRunning> <ConnectionsCurrent>11</ConnectionsCurrent> <ConnectionsTotal>43777</ConnectionsTotal> <ConnectionsTotalAccepted>43135</ConnectionsTotalAccepted> <ConnectionsTotalRejected>642</ConnectionsTotalRejected> <MessagesInBytesRate>27876.0</MessagesInBytesRate> <MessagesOutBytesRate>175053.0</MessagesOutBytesRate>

Read the article
Complex SQL query with group by and two rows in one

- by Ricket

Okay, I need help. I'm usually pretty good at SQL queries but this one baffles me. By the way, this is not a homework assignment, it's a real situation in an Access database and I've written the requirements below myself. Here is my table layout. It's in Access 2007 if that matters; I'm writing the query using SQL. Id (primary key) PersonID (foreign key) EventDate NumberOfCredits SuperCredits (boolean) There are events that people go to. They can earn normal credits, or super credits, or both at one event. The SuperCredits column is true if the row represents a number of super credits earned at the event, or false if it represents normal credits. So for example, if there is an event which person 174 attends, and they earn 3 normal credits and 1 super credit at the event, the following two rows would be added to the table: ID PersonID EventDate NumberOfCredits SuperCredits 1 174 1/1/2010 3 false 2 174 1/1/2010 1 true It is also possible that the person could have done two separate things at the event, so there might be more than two columns for one event, and it might look like this: ID PersonID EventDate NumberOfCredits SuperCredits 1 174 1/1/2010 1 false 2 174 1/1/2010 2 false 3 174 1/1/2010 1 true Now we want to print out a report. Here will be the columns of the report: PersonID LastEventDate NumberOfNormalCredits NumberOfSuperCredits The report will have one row per person. The row will show the latest event that the person attended, and the normal and super credits that the person earned at that event. What I am asking of you is to write, or help me write, the SQL query to SELECT the data and GROUP BY and SUM() and whatnot. Or, let me know if this is for some reason not possible, and how to organize my data to make it possible. This is extremely confusing and I understand if you do not take the time to puzzle through it. I've tried to simplify it as much as possible, but definitely ask any questions if you give it a shot and need clarification. I'll be trying to figure it out but I'm having a real hard time with it, this is grouping beyond my experience...

Read the article
Optimize MySQL query (ngrams, COUNT(), GROUP BY, ORDER BY)

- by Gerardo

I have a database with thousands of companies and their locations. I have implemented n-grams to optimize search. I am making one query to retrieve all the companies that match with the search query and another one to get a list with their locations and the number of companies in each location. The query I am trying to optimize is the latter. Maybe the problem is this: Every company ('anunciante') has a field ('estado') to make logical deletes. So, if 'estado' equals 1, the company should be retrieved. When I run the EXPLAIN command, it shows that it goes through almost 40k rows, when the actual result (the reality matching companies) are 80. How can I optimize this? This is my query (XXX represent the n-grams for the search query): SELECT provincias.provincia AS provincia, provincias.id, COUNT(*) AS cantidad FROM anunciantes JOIN anunciante_invertido AS a_i0 ON anunciantes.id = a_i0.id_anunciante JOIN indice_invertido AS indice0 ON a_i0.id_invertido = indice0.id LEFT OUTER JOIN domicilios ON anunciantes.id = domicilios.id_anunciante LEFT OUTER JOIN localidades ON domicilios.id_localidad = localidades.id LEFT OUTER JOIN provincias ON provincias.id = localidades.id_provincia WHERE anunciantes.estado = 1 AND indice0.id IN (SELECT invertido_ngrama.id_palabra FROM invertido_ngrama JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama WHERE ngrama.ngrama = 'XXX') AND indice0.id IN (SELECT invertido_ngrama.id_palabra FROM invertido_ngrama JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama WHERE ngrama.ngrama = 'XXX') AND indice0.id IN (SELECT invertido_ngrama.id_palabra FROM invertido_ngrama JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama WHERE ngrama.ngrama = 'XXX') AND indice0.id IN (SELECT invertido_ngrama.id_palabra FROM invertido_ngrama JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama WHERE ngrama.ngrama = 'XXX') AND indice0.id IN (SELECT invertido_ngrama.id_palabra FROM invertido_ngrama JOIN ngrama ON ngrama.id = invertido_ngrama.id_ngrama WHERE ngrama.ngrama = 'XXX') GROUP BY provincias.id ORDER BY cantidad DESC And this is the query explained (hope it can be read in this format): id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY anunciantes ref PRIMARY,estado estado 1 const 36669 Using index; Using temporary; Using filesort 1 PRIMARY domicilios ref id_anunciante id_anunciante 4 db84771_viaempresas.anunciantes.id 1 1 PRIMARY localidades eq_ref PRIMARY PRIMARY 4 db84771_viaempresas.domicilios.id_localidad 1 1 PRIMARY provincias eq_ref PRIMARY PRIMARY 4 db84771_viaempresas.localidades.id_provincia 1 1 PRIMARY a_i0 ref PRIMARY,id_anunciante,id_invertido PRIMARY 4 db84771_viaempresas.anunciantes.id 1 Using where; Using index 1 PRIMARY indice0 eq_ref PRIMARY PRIMARY 4 db84771_viaempresas.a_i0.id_invertido 1 Using index 6 DEPENDENT SUBQUERY ngrama const PRIMARY,ngrama ngrama 5 const 1 Using index 6 DEPENDENT SUBQUERY invertido_ngrama eq_ref PRIMARY,id_palabra,id_ngrama PRIMARY 8 func,const 1 Using index 5 DEPENDENT SUBQUERY ngrama const PRIMARY,ngrama ngrama 5 const 1 Using index 5 DEPENDENT SUBQUERY invertido_ngrama eq_ref PRIMARY,id_palabra,id_ngrama PRIMARY 8 func,const 1 Using index 4 DEPENDENT SUBQUERY ngrama const PRIMARY,ngrama ngrama 5 const 1 Using index 4 DEPENDENT SUBQUERY invertido_ngrama eq_ref PRIMARY,id_palabra,id_ngrama PRIMARY 8 func,const 1 Using index 3 DEPENDENT SUBQUERY ngrama const PRIMARY,ngrama ngrama 5 const 1 Using index 3 DEPENDENT SUBQUERY invertido_ngrama eq_ref PRIMARY,id_palabra,id_ngrama PRIMARY 8 func,const 1 Using index 2 DEPENDENT SUBQUERY ngrama const PRIMARY,ngrama ngrama 5 const 1 Using index 2 DEPENDENT SUBQUERY invertido_ngrama eq_ref PRIMARY,id_palabra,id_ngrama PRIMARY 8 func,const 1 Using index

Read the article
Select latest group by in nhibernate

- by Kendrick

I have Canine and CanineHandler objects in my application. The CanineHandler object has a PersonID (which references a completely different database), an EffectiveDate (which specifies when a handler started with the canine), and a FK reference to the Canine (CanineID). Given a specific PersonID, I want to find all canines they're currently responsible for. The (simplified) query I'd use in SQL would be: Select Canine.* from Canine inner join CanineHandler on(CanineHandler.CanineID=Canine.CanineID) inner join (select CanineID,Max(EffectiveDate) MaxEffectiveDate from caninehandler group by CanineID) as CurrentHandler on(CurrentHandler.CanineID=CanineHandler.CanineID and CurrentHandler.MaxEffectiveDate=CanineHandler.EffectiveDate) where CanineHandler.HandlerPersonID=@PersonID Edit: Added mapping files below: <class name="CanineHandler" table="CanineHandler" schema="dbo"> <id name="CanineHandlerID" type="Int32"> <generator class="identity" /> </id> <property name="EffectiveDate" type="DateTime" precision="16" not-null="true" /> <property name="HandlerPersonID" type="Int64" precision="19" not-null="true" /> <many-to-one name="Canine" class="Canine" column="CanineID" not-null="true" access="field.camelcase-underscore" /> </class> <class name="Canine" table="Canine"> <id name="CanineID" type="Int32"> <generator class="identity" /> </id> <property name="Name" type="String" length="64" not-null="true" /> ... <set name="CanineHandlers" table="CanineHandler" inverse="true" order-by="EffectiveDate desc" cascade="save-update" access="field.camelcase-underscore"> <key column="CanineID" /> <one-to-many class="CanineHandler" /> </set> <property name="IsDeleted" type="Boolean" not-null="true" /> </class> I haven't tried yet, but I'm guessing I could do this in HQL. I haven't had to write anything in HQL yet, so I'll have to tackle that eventually anyway, but my question is whether/how I can do this sub-query with the criterion/subqueries objects. I got as far as creating the following detached criteria: DetachedCriteria effectiveHandlers = DetachedCriteria.For<Canine>() .SetProjection(Projections.ProjectionList() .Add(Projections.Max("EffectiveDate"),"MaxEffectiveDate") .Add(Projections.GroupProperty("CanineID"),"handledCanineID") ); but I can't figure out how to do the inner join. If I do this: Session.CreateCriteria<Canine>() .CreateCriteria("CanineHandler", "handler", NHibernate.SqlCommand.JoinType.InnerJoin) .List<Canine>(); I get an error "could not resolve property: CanineHandler of: OPS.CanineApp.Model.Canine". Obviously I'm missing something(s) but from the documentation I got the impression that should return a list of Canines that have handlers (possibly with duplicates). Until I can make this work, adding the subquery isn't going to work... I've found similar questions, such as http://stackoverflow.com/questions/747382/only-get-latest-results-using-nhibernate but none of the answers really seem to apply with the kind of direct result I'm looking for. Any help or suggestion is greatly appreciated.

Read the article
Help Converting T-SQL Query to LINQ Query

- by campbelt

I am new to LINQ, and so am struggle over some queries that I'm sure are pretty simple. In any case, I have been hiting my head against this for a while, but I'm stumped. Can anyone here help me convert this T-SQL query into a LINQ query? Once I see how it is done, I'm sure I'll have some question about the syntax: SELECT BlogTitle FROM Blogs b JOIN BlogComments bc ON b.BlogID = bc.BlogID WHERE b.Deleted = 0 AND b.Draft = 0 AND b.[Default] = 0 AND bc.Deleted = 0 GROUP BY BlogTitle ORDER BY MAX([bc].[Timestamp]) DESC Just to show that I have tried to solve this on my own, here is what I've come up with so far, though it doesn't compile, let alone work ... var iqueryable = from blog in db.Blogs join blogComment in db.BlogComments on blog.BlogID equals blogComment.BlogID where blog.Deleted == false && blog.Draft == false && blog.Default == false && blogComment.Deleted == false group blogComment by blog.BlogID into blogGroup orderby blogGroup.Max(blogComment => blogComment.Timestamp) select blogGroup;

Read the article
Optimizing a lot of Scanner.findWithinHorizon(pattern, 0) calls

- by darvids0n

I'm building a process which extracts data from 6 csv-style files and two poorly laid out .txt reports and builds output CSVs, and I'm fully aware that there's going to be some overhead searching through all that whitespace thousands of times, but I never anticipated converting about about 50,000 records would take 12 hours. Excerpt of my manual matching code (I know it's horrible that I use lists of tokens like that, but it was the best thing I could think of): public static String lookup(List<String> tokensBefore, List<String> tokensAfter) { String result = null; while(_match(tokensBefore)) { // block until all input is read if(id.hasNext()) { result = id.next(); // capture the next token that matches if(_matchImmediate(tokensAfter)) // try to match tokensAfter to this result return result; } else return null; // end of file; no match } return null; // no matches } private static boolean _match(List<String> tokens) { return _match(tokens, true); } private static boolean _match(List<String> tokens, boolean block) { if(tokens != null && !tokens.isEmpty()) { if(id.findWithinHorizon(tokens.get(0), 0) == null) return false; for(int i = 1; i <= tokens.size(); i++) { if (i == tokens.size()) { // matches all tokens return true; } else if(id.hasNext() && !id.next().matches(tokens.get(i))) { break; // break to blocking behaviour } } } else { return true; // empty list always matches } if(block) return _match(tokens); // loop until we find something or nothing else return false; // return after just one attempted match } private static boolean _matchImmediate(List<String> tokens) { if(tokens != null) { for(int i = 0; i <= tokens.size(); i++) { if (i == tokens.size()) { // matches all tokens return true; } else if(!id.hasNext() || !id.next().matches(tokens.get(i))) { return false; // doesn't match, or end of file } } return false; // we have some serious problems if this ever gets called } else { return true; // empty list always matches } } Basically wondering how I would work in an efficient string search (Boyer-Moore or similar). My Scanner id is scanning a java.util.String, figured buffering it to memory would reduce I/O since the search here is being performed thousands of times on a relatively small file. The performance increase compared to scanning a BufferedReader(FileReader(File)) was probably less than 1%, the process still looks to be taking a LONG time. I've also traced execution and the slowness of my overall conversion process is definitely between the first and last like of the lookup method. In fact, so much so that I ran a shortcut process to count the number of occurrences of various identifiers in the .csv-style files (I use 2 lookup methods, this is just one of them) and the process completed indexing approx 4 different identifiers for 50,000 records in less than a minute. Compared to 12 hours, that's instant. Some notes (updated): I don't necessarily need the pattern-matching behaviour, I only get the first field of a line of text so I need to match line breaks or use Scanner.nextLine(). All ID numbers I need start at position 0 of a line and run through til the first block of whitespace, after which is the name of the corresponding object. I would ideally want to return a String, not an int locating the line number or start position of the result, but if it's faster then it will still work just fine. If an int is being returned, however, then I would now have to seek to that line again just to get the ID; storing the ID of every line that is searched sounds like a way around that. Anything to help me out, even if it saves 1ms per search, will help, so all input is appreciated. Thankyou! Usage scenario 1: I have a list of objects in file A, who in the old-style system have an id number which is not in file A. It is, however, POSSIBLY in another csv-style file (file B) or possibly still in a .txt report (file C) which each also contain a bunch of other information which is not useful here, and so file B needs to be searched through for the object's full name (1 token since it would reside within the second column of any given line), and then the first column should be the ID number. If that doesn't work, we then have to split the search token by whitespace into separate tokens before doing a search of file C for those tokens as well. Generalised code: String field; for (/* each record in file A */) { /* construct the rest of this object from file A info */ // now to find the ID, if we can List<String> objectName = new ArrayList<String>(1); objectName.add(Pattern.quote(thisObject.fullName)); field = lookup(objectSearchToken, objectName); // search file B if(field == null) // not found in file B { lookupReset(false); // initialise scanner to check file C objectName.clear(); // not using the full name String[] tokens = thisObject.fullName.split(id.delimiter().pattern()); for(String s : tokens) objectName.add(Pattern.quote(s)); field = lookup(objectSearchToken, objectName); // search file C lookupReset(true); // back to file B } else { /* found it, file B specific processing here */ } if(field != null) // found it in B or C thisObject.ID = field; } The objectName tokens are all uppercase words with possible hyphens or apostrophes in them, separated by spaces. Much like a person's name. As per a comment, I will pre-compile the regex for my objectSearchToken, which is just [\r\n]+. What's ending up happening in file C is, every single line is being checked, even the 95% of lines which don't contain an ID number and object name at the start. Would it be quicker to use ^[\r\n]+.*(objectname) instead of two separate regexes? It may reduce the number of _match executions. The more general case of that would be, concatenate all tokensBefore with all tokensAfter, and put a .* in the middle. It would need to be matching backwards through the file though, otherwise it would match the correct line but with a huge .* block in the middle with lots of lines. The above situation could be resolved if I could get java.util.Scanner to return the token previous to the current one after a call to findWithinHorizon. I have another usage scenario. Will put it up asap.

Read the article
Tricky MySQL Query for messaging system in Rails - Please Help

- by ole_berlin

Hi, I'm writing a facebook style messaging system for a Rails App and I'm having trouble selecting the Messages for the inbox (with will_paginate). The messages are organized in threads, in the inbox the most recent message of a thread will appear with a link to it's thread. The thread is organized via a parent_id 1-n relationship with itself. So far I'm using something like this: class Message < ActiveRecord::Base belongs_to :sender, :class_name => 'User', :foreign_key => "sender_id" belongs_to :recipient, :class_name => 'User', :foreign_key => "recipient_id" has_many :children, :class_name => "Message", :foreign_key => "parent_id" belongs_to :thread, :class_name => "Message", :foreign_key => "parent_id" end class MessagesController < ApplicationController def inbox @messages = current_user.received_messages.paginate :page => params[:page], :per_page => 10, :order => "created_at DESC" end end That gives me all the messages, but for one thread the thread itself and the most recent message will appear (and not only the most recent message). I can also not use the GROUP BY clause, because for the thread itself (the parent so to say) the parent_id = nil of course. Anyone got an idea on how to solve this in an elegant way? I already thought about adding the parent_id to the parent itself and then group by parent_id, but I'm not sure if that works. Thanks

Read the article
SQL: How to Return One DB Row from Two That Have The Same Values In Opposite Columns Using the MAX F

- by OneSource

Hi, This is what I'm trying to do. I have three columns in a table - ID, Column1, Column2 - with this example data: ID Column1 Column2 1 1 2 2 2 1 3 4 3 4 3 4 Since, in the first two rows, Column1 and Column2 have the same values (but in different columns), I want my MAX query to return an ID of 2. Same thing with rows 3 and 4 .... since Columns 1 and 2 have the same values (but in different columns), I want MAX(ID) to return 4. Of course, with MAX, you use Group By, but that will not work in my case. In effect, I need a Group By to work across two columns. Is this possible? If not, what's the best way to accomplish getting the IDs of 2 and 4 given the matching values that are in different columns? Thanks!

Read the article

< Previous Page | 120 121 122 123 124 125 126 127 128 129 130 131 | Next Page >