What is the best way to split an existing Lucene index into two halves, i.e. so that each split contains half of the total number of documents in the original index?
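A naive sketch of what I have in mind, untested (note it only copies stored fields, so indexed-only data would be lost; paths and the 3.x-era API calls are my assumptions):

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class IndexSplit {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open(FSDirectory.open(new File("orig")));
        IndexWriter left = new IndexWriter(FSDirectory.open(new File("half1")),
                new StandardAnalyzer(Version.LUCENE_30), IndexWriter.MaxFieldLength.UNLIMITED);
        IndexWriter right = new IndexWriter(FSDirectory.open(new File("half2")),
                new StandardAnalyzer(Version.LUCENE_30), IndexWriter.MaxFieldLength.UNLIMITED);

        int mid = reader.maxDoc() / 2;                // split point: half the doc ids
        for (int i = 0; i < reader.maxDoc(); i++) {
            if (reader.isDeleted(i)) continue;        // skip deleted slots
            (i < mid ? left : right).addDocument(reader.document(i));
        }
        left.close(); right.close(); reader.close();
    }
}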
A<<B means rotating the characters of A to the left by the length of the data in B. So I am using split rather than a loop, but I am not able to figure out how to split a string according to the length of B.
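If it helps, here is a minimal sketch of the slicing-based split I have in mind (Python purely for illustration; the function name is mine):

def rotate_left(a, b):
    """Rotate a's characters left by len(b), by splitting a at that index."""
    n = len(b) % len(a)      # guard against b being longer than a
    head, tail = a[:n], a[n:]
    return tail + head

print(rotate_left("abcdef", "xy"))  # -> "cdefab"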
Hi, I need to split a list by an argument in Haskell. I found a function like this:
group :: Int -> [a] -> [[a]]
group _ [] = []
group n l
| n > 0 = (take n l) : (group n (drop n l))
| otherwise = error "Negative n"
But what if lists that I want to divide are contained by another list?
For example
group 3 [[1,2,3,4,5,6],[2,4,6,8,10,12]]
should return
[[[1,2,3],[4,5,6]],[[2,4,6],[8,10,12]]]
Is there any way to do that?
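A minimal sketch of what I suspect the answer looks like (mapping group over the outer list), assuming that is indeed the idiomatic approach:

-- Apply the existing chunking function to every inner list.
groupEach :: Int -> [[a]] -> [[[a]]]
groupEach n = map (group n)

-- ghci> groupEach 3 [[1,2,3,4,5,6],[2,4,6,8,10,12]]
-- [[[1,2,3],[4,5,6]],[[2,4,6],[8,10,12]]]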
Hi!
I'm wondering whether there is a way to force SQL Server Management Studio to produce a script like this:
ALTER TABLE Mytable
ADD MyCol bit NOT NULL
CONSTRAINT MyColDefault
DEFAULT 0 WITH VALUES
ALTER TABLE [dbo].Mytable
ALTER COLUMN MyCol2 int NULL
GO
when I alter a very simple property of a column on a table.
If I do this in the designer and ask for the produced script, the script doesn't do such simple tasks; instead it copies all the data into a tmp table, drops the original table, and renames the tmp table to the original table name. And, of course, it drops and recreates every constraint and relationship.
Is there any option I can set to change this behaviour? Or, since that may not be possible, is there some danger I don't see in just running the simple ALTER TABLE above myself?
Thanks.
I'm using Paperclip to upload a PDF. Once the file is uploaded I need to convert every page into a PNG. This is the command I think I need to use:
convert -size 640x300 fileName.pdf slide.png
Now if I run that command from the terminal it works fine, but I need a way of getting each slide's name so I can add it to a model.
What's the best way to achieve this?
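A rough sketch of what I'm imagining (the Slide model and file layout are hypothetical; ImageMagick numbers multi-page output when the filename contains %d):

require "shellwords"

pdf  = "fileName.pdf"
base = "slide"

# produces slide-0.png, slide-1.png, ... one per PDF page
system("convert -size 640x300 #{Shellwords.escape(pdf)} #{base}-%d.png")

Dir.glob("#{base}-*.png").sort.each do |path|
  # Slide is a hypothetical model; attach however Paperclip expects, e.g.
  # Slide.create!(:image => File.open(path))
  puts path
end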
I've got a table in a div, with a vertical scrollbar on the div to allow the table to be longer than the div can hold. Works fine. But I'd like to allow the user to resize the div vertically if they want to be able to view more of the table. I've been playing with the jQueryUI resizable interaction, but it doesn't seem to quite do what I want; at least, not so far. I've tried making the wrapper div resizable, but the behavior's erratic.
If I have the style "height:20em; overflow:auto;" on it, then I can resize the table horizontally, but not vertically. If I remove the overflow, then the table flows outside the div of course. If I remove the height, then the table is actually resizable, but it is initially drawn at full height. Anyone know of a way to specify an initial height, but allow it to be resized larger than that?
If I make the table resizable rather than the div, then I can resize the table horizontally within the div but I can't increase the height of the displayed table. Which makes sense, of course, but I thought I'd mention it.
Also, is there a way to make the resize "handle" appear in the corner between the horizontal and vertical scrollbars? Right now it's a sort of invisible handle at the bottom-right of the table.
Thanks for any thoughts.
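In case a concrete setup helps, this is roughly what I've been trying (my guess at the resizable options; untested):

// Wrapper div with a fixed initial height that the user may drag taller.
$('#tableWrapper')
  .css({ height: '20em', overflow: 'auto' })
  .resizable({
    handles: 'se',    // ideally this would sit between the two scrollbars
    minHeight: 100,
    maxHeight: 2000   // allow growing well beyond the initial 20em
  });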
Suppose I have a set of objects, S. There is an algorithm f that, given a set S, builds a certain data structure D on it: f(S) = D. If S is large and/or contains vastly different objects, D becomes large, to the point of being unusable (i.e. not fitting in allotted memory). To overcome this, I split S into several non-intersecting subsets: S = S1 + S2 + ... + Sn and build Di for each subset. Using n structures is less efficient than using one, but at least this way I can fit into memory constraints. Since the size of f(S) grows faster than S itself, the combined size of the Di is much less than the size of D.
However, it is still desirable to reduce n, i.e. the number of subsets; or reduce the combined size of Di. For this, I need to split S in such a way that each Si contains "similar" objects, because then f will produce a smaller output structure if input objects are "similar enough" to each other.
The problem is that while "similarity" of objects in S and the size of f(S) do correlate, there is no way to compute the latter other than just evaluating f(S), and f is not quite fast.
The algorithm I currently have iteratively adds each next object from S to the Si that yields the least possible (at this stage) increase in combined Di size:
for x in S:
    # subsets[j] plays the role of Sj; pick the one whose structure grows least
    i = min(range(len(subsets)),
            key=lambda j: size(f(subsets[j] | {x})) - size(f(subsets[j])))
    subsets[i] = subsets[i] | {x}
This gives practically useful results, but is certainly pretty far from the optimum (i.e. the minimal possible combined size). It is also slow. To speed it up somewhat, I compute size(f(Si + {x})) - size(f(Si)) only for those i where x is "similar enough" to the objects already in Si.
Is there any standard approach to such kinds of problems?
I know of the branch and bound algorithm family, but it cannot be applied here because it would be prohibitively slow. My guess is that it is simply not possible to compute the optimal distribution of S into Si in reasonable time. But is there some common iteratively improving algorithm?
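To make "iteratively improving" concrete, this is the kind of single-object relocation pass I'm picturing on top of the greedy assignment (my own sketch, untested, with size_of(s) standing in for size(f(s))):

def refine(subsets, size_of, max_passes=5):
    # Greedy local search: try moving one object at a time to another subset,
    # keeping any move that shrinks the combined structure size.
    for _ in range(max_passes):
        improved = False
        for i in range(len(subsets)):
            for x in list(subsets[i]):
                src_without = subsets[i] - {x}
                gain = size_of(subsets[i]) - size_of(src_without)
                best_j, best_delta = None, 0
                for j in range(len(subsets)):
                    if j == i:
                        continue
                    cost = size_of(subsets[j] | {x}) - size_of(subsets[j])
                    if cost - gain < best_delta:   # negative delta = net win
                        best_j, best_delta = j, cost - gain
                if best_j is not None:
                    subsets[i] = src_without
                    subsets[best_j] = subsets[best_j] | {x}
                    improved = True
        if not improved:
            break    # local optimum: no single move helps
    return subsets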
Hi,
I'm writing a parser in Python. I've converted an input string into a list of tokens, such as:
['(', '2', '.', 'x', '.', '(', '3', '-', '1', ')', '+', '4', ')', '/', '3', '.', 'x', '^', '2']
I want to be able to split the list into multiple lists, like the str.split('+') function does for strings. But there doesn't seem to be a way to do my_list.split('+'). Any ideas?
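To be concrete, this is the behaviour I'm after, written as the naive loop I'd like to replace with something more idiomatic:

def split_list(tokens, sep):
    """Split a list into sublists on every occurrence of sep (sep is dropped)."""
    result, current = [], []
    for tok in tokens:
        if tok == sep:
            result.append(current)
            current = []
        else:
            current.append(tok)
    result.append(current)
    return result

split_list(['2', '+', '4', '+', 'x'], '+')  # -> [['2'], ['4'], ['x']]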
Thanks!
I'm having a weird problem with an index-organized table (IOT). I'm running Oracle 11g Standard Edition.
I have a table src_table:
SQL> desc src_table;
Name Null? Type
--------------- -------- ----------------------------
ID NOT NULL NUMBER(16)
HASH NOT NULL NUMBER(3)
........
SQL> select count(*) from src_table;
COUNT(*)
----------
21108244
Now let's create another table and copy two columns from src_table:
set timing on
SQL> create table dest_table(id number(16), hash number(20), type number(1));
Table created.
Elapsed: 00:00:00.01
SQL> insert /*+ APPEND */ into dest_table (id,hash,type) select id, hash, 1 from src_table;
21108244 rows created.
Elapsed: 00:00:15.25
SQL> ALTER TABLE dest_table ADD ( CONSTRAINT dest_table_pk PRIMARY KEY (HASH, id, TYPE));
Table altered.
Elapsed: 00:01:17.35
It took Oracle < 2 min.
Now the same exercise, but with an IOT:
SQL> CREATE TABLE dest_table_iot (
id NUMBER(16) NOT NULL,
hash NUMBER(20) NOT NULL,
type NUMBER(1) NOT NULL,
CONSTRAINT dest_table_iot_PK PRIMARY KEY (HASH, id, TYPE)
) ORGANIZATION INDEX;
Table created.
Elapsed: 00:00:00.03
SQL> INSERT /*+ APPEND */ INTO dest_table_iot (HASH,id,TYPE)
SELECT HASH, id, 1
FROM src_table;
"insert" into IOT takes 18 hours !!! I have tried it on 2 different instances of Oracle running on win and linux and got same results.
What is going on here ? Why is it taking so long ?
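One variation I haven't tried yet, in case feeding the rows unsorted is what hurts (just a guess on my part):

-- Present the rows already sorted in primary-key order, so the
-- index-organized table is built left-to-right rather than by
-- random insertion into the B-tree.
INSERT /*+ APPEND */ INTO dest_table_iot (HASH, id, TYPE)
SELECT HASH, id, 1
FROM src_table
ORDER BY HASH, id;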
How do I split a UIPickerView into multiple parts (components), like the date picker, only not day, month, and year, but my own specified variables, such as gender and age?
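To illustrate, here's a minimal sketch of the data source I imagine (Swift; the class and values are my own, untested):

import UIKit

class DemoPicker: NSObject, UIPickerViewDataSource, UIPickerViewDelegate {
    let genders = ["Male", "Female"]
    let ages = Array(1...100).map(String.init)

    func numberOfComponents(in pickerView: UIPickerView) -> Int {
        return 2  // one "wheel" per variable, like the date picker's columns
    }
    func pickerView(_ pickerView: UIPickerView, numberOfRowsInComponent component: Int) -> Int {
        return component == 0 ? genders.count : ages.count
    }
    func pickerView(_ pickerView: UIPickerView, titleForRow row: Int, forComponent component: Int) -> String? {
        return component == 0 ? genders[row] : ages[row]
    }
}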
Regards, SO
I am new to Python and Perl. I have been trying to solve a simple problem and getting tied in knots with syntax. I hope someone has the time and patience to help.
I have a 25 MB file in .txt format which contains news-wire articles going back to 1970. Each news story is concatenated to the next, with only the "Copyright" statement to delimit them. Each news story starts with "Item XX of XXX DOCUMENTS". There are certain metadata that are repeated throughout; I will use these for tagging later on.
I wish to split this 25 MB file into separate .txt files, each containing one news story (i.e. the text between "DOCUMENTS" and "Copyright"), saving each with a different name (obviously).
I am trying to 1) open the file, 2) iterate over the lines in the file, checking for the delimiter, and writing each line to a list if it is not present, then 3) write that list out to a separate small file.
I'm having big problems with changing the filenames using the counter, and with making Python start from where I left off; is the "seek" function appropriate?
So far I have been trying this approach, completely unsuccessfully:
import re

story = []          # lines of the current news story
filenumber = 0      # counter that goes into each output filename

# no seek() needed: the file is read once, front to back
with open("myfile.txt", "r") as myfile:
    for line in myfile:
        story.append(line)
        if re.match(r"\s*Copyright", line):   # delimiter: story is complete
            filenumber += 1
            # interpolate the counter into the filename (naming scheme is mine)
            name = r"C:\Users\dunner7\Documents\story_%04d.txt" % filenumber
            with open(name, "w") as output:
                output.writelines(story)
            story = []
Thank you for your time and patience.
RD
I have a data set like below:
Country Region Molecule Item Code
IND NA PB102 FR206985511
THAI AP PB103 BA-107603 / F000113361 / 107603
LUXE NA PB105 1012701 / SGP-1012701 / F041701000
IND AP PB106 AU206985211 / CA-F206985211
THAI HP PB107 F034702000 / 1010701 / SGP-1010701
BANG NA PB108 F000007970/25781/20009021
I want to split the string values in the Item Code column on "/" and create a new row for each entry.
For instance, the desired output will be:
Country Region Molecule Item Code New row
IND NA PB102 FR206985511 FR206985511
THAI AP PB103 BA-107603 / F000113361 / 107603 F000113361
107603
BA-107603
LUXE NA PB105 1012701 / SGP-1012701 / F041701000 1012701
SGP-1012701
F041701000
IND AP PB106 AU206985211 / CA-F206985211 AU206985211
CA-F206985211
THAI HP PB107 F034702000 / 1010701 / SGP-1010701 F034702000
1010701
SGP-1010701
BANG NA PB108 F000007970/25781/20009021 F000007970
25781
20009021
I tried the code below:
library(splitstackshape)
df2=concat.split.multiple(df1,"Plant.Item.Code","/", direction="long")
but got the Error
"Error: memory exhausted (limit reached?)"
When I tried strsplit() I got the error message below.
Error in strsplit(df1$Plant.Item.Code, "/") : non-character argument
Any help from you will be appreciated.
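For completeness, this is the variant I plan to try next, assuming the column is a factor (which would explain the "non-character argument" from strsplit()) and that cSplit from splitstackshape behaves as documented:

library(splitstackshape)

# strsplit()/cSplit need character input; a factor column triggers the error
df1$Plant.Item.Code <- as.character(df1$Plant.Item.Code)

# one output row per "/"-separated entry, with the other columns repeated
df2 <- cSplit(df1, "Plant.Item.Code", sep = "/", direction = "long")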
Using InsertOnSubmit seems to have some memory overhead.
I have a System.Data.Linq.Table<User> table. When I do table.InsertOnSubmit(user) and then int count = table.Count(), the memory usage of my application increases by roughly the size of the User table, but the count is the number of items before user was inserted. So I'm guessing that an enumeration after InsertOnSubmit creates a copy of the table. Is that true?
Hello,
I am reading some parameters (from user input) from a .txt file and want to make sure that my script can read a parameter even if the user leaves a space or tab in front of it.
Also, I want to add a comment for each parameter, followed by #, after the parameter (e.g. 7870 #this is default port number) to let the user know about the parameter.
How can I achieve this in the same file?
Here is what I am using: (/\|\s/)
code::
$data_file = "config.txt";
open(RAK, $data_file) || die("Could not open file!");
@raw_data = <RAK>;
@Ftp_Server = split(/\|\s/, $raw_data[32]);
config.txt (user input file)
PING_TTL | 1
CLIENT_PORT | 7870
FTP_SERVER | 192.162.522.222
Could anybody suggest a robust way to do it?
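To be concrete, this is the direction I'm thinking of (untested sketch; %config is my own name):

my %config;
foreach my $line (@raw_data) {
    $line =~ s/#.*$//;          # drop a trailing "#..." comment, if any
    $line =~ s/^\s+|\s+$//g;    # trim leading/trailing spaces and tabs
    next unless length $line;   # skip lines that are now empty
    my ($key, $value) = split /\s*\|\s*/, $line, 2;
    $config{$key} = $value;     # e.g. $config{CLIENT_PORT} is 7870
}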
/rocky
I have a datamodel that has an intermediate table to manage relationships between entities.
For example, tables Person and Organization are related through the Relationship table:
Party (table)
- ID

Person (table)
- ID (references Party.ID)
- name

Organization (table)
- ID (references Party.ID)
- name

Relationship (table)
- ID (PK)
- type (references RelationshipType lookup)
- fromID (references Party.ID)
- ToID (references Party.ID)
- fromDate
- ToDate

type + fromID + ToID + fromDate + ToDate is guaranteed to be unique.
How do I manage this using Hibernate?
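A sketch of the mapping I've been picturing (JPA annotations; class and field names are my guesses, untested):

import java.util.Date;
import javax.persistence.*;

@Entity
@Table(name = "Relationship",
       uniqueConstraints = @UniqueConstraint(
           columnNames = {"type", "fromID", "ToID", "fromDate", "ToDate"}))
public class Relationship {

    @Id
    @GeneratedValue
    private Long id;

    @ManyToOne
    @JoinColumn(name = "type")
    private RelationshipType type;   // the lookup entity

    @ManyToOne
    @JoinColumn(name = "fromID")
    private Party from;              // assuming Person and Organization extend Party

    @ManyToOne
    @JoinColumn(name = "ToID")
    private Party to;

    @Temporal(TemporalType.DATE)
    private Date fromDate;

    @Temporal(TemporalType.DATE)
    private Date toDate;

    // getters/setters omitted
}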
TIA
I would like to add an index (or indexes) to my table.
I am looking for general ideas on how to add more indexes to a table, other than the clustered PK.
I would like to know what to look for when I am doing this.
So, my example:
This table (let's call it the TASK table) is going to be the biggest table of the whole application. Expecting millions of records.
IMPORTANT: a massive bulk-insert is adding data to this table. The table has 27 columns (so far, and counting :D):
int x 9 columns = id-s
varchar x 10 columns
bit x 2 columns
datetime x 5 columns
INT COLUMNS
all of these are INT IDs, but from tables that are usually much smaller than the Task table (10-50 records max), for example a Status table (with values like "open", "closed") or a Priority table (with values like "important", "not so important", "normal")
there is also a column like "parent-ID" (a self-referencing ID)
join: all the "small" tables have a PK, the usual way... clustered
STRING COLUMNS
there is a Company column (a string!) that is "5 characters long all the time", and every user will be restricted by it. If there are 15 different "Companies" in Task, the logged-in user would only see one. So there's always a filter on this column. Might it be a good idea to add an index to it? (See the index sketch at the end of the question.)
DATE COLUMNS
I think these don't get indexed... right? Or can / should they be?
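To make the Company idea concrete, this is the sort of index I have in mind (all names are hypothetical):

-- Nonclustered index supporting the ever-present company filter;
-- INCLUDE covers a couple of columns a typical list query might read.
CREATE NONCLUSTERED INDEX IX_Task_Company
    ON dbo.Task (CompanyCode)
    INCLUDE (StatusID, PriorityID, CreatedDate);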
I have a file with a bunch of numbers in columns. The numbers are separated by a variable number of spaces. I want to skip the first line, read all the other lines, and separate each number on a line. Finally, I want to write each number into Excel. I've been able to get the lines and write them to Excel, but I can't separate the numbers (I'm getting the whole line as one string).
Does anybody know how to split a string that has a variable number of spaces?
Here is my code.
Sub Test()
    Dim r As Long
    r = 0
    With New Scripting.FileSystemObject
        With .OpenTextFile("C:\Users\User\Desktop\File.tab", ForReading)
            If Not .AtEndOfStream Then .SkipLine   ' skip the header line
            Do Until .AtEndOfStream
                ' Splitting on vbCrLf never matches inside a single line,
                ' so the whole line lands in the cell as one string.
                ActiveCell.Offset(r, 0) = Split(.ReadLine, vbCrLf)
                r = r + 1
            Loop
        End With
    End With
End Sub
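A sketch of the splitting I think I need inside that loop (untested; it collapses tabs and runs of spaces into single separators before calling Split):

Dim lineText As String
Dim parts() As String
Dim c As Long

lineText = Trim(Replace(.ReadLine, vbTab, " "))
Do While InStr(lineText, "  ") > 0           ' collapse runs of spaces
    lineText = Replace(lineText, "  ", " ")
Loop

parts = Split(lineText, " ")
For c = 0 To UBound(parts)                   ' one number per column
    ActiveCell.Offset(r, c).Value = parts(c)
Next c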
Hello!
I have the following T-SQL code, which generates an error:
Declare @table TABLE
(
ID1 int,
ID2 int
)
INSERT INTO @table values(1, 1);
INSERT INTO @table values(2, 2);
INSERT INTO @table values(3, 3);
DECLARE @field varchar(50);
SET @field = 'ID1'
DECLARE @query varchar(MAX);
SET @query = 'SELECT * FROM @table WHERE ' + @field + ' = 1'
EXEC (@query)
The error is: Must declare the table variable "@table".
What's wrong with the query? How do I fix it?
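For context, my understanding is that EXEC runs the string in a separate batch, where the table variable is out of scope. A temp table is visible there, so this variant is what I'd try (sketch):

-- Temp tables, unlike table variables, are visible inside EXEC's batch.
CREATE TABLE #t (ID1 int, ID2 int);
INSERT INTO #t VALUES (1, 1);
INSERT INTO #t VALUES (2, 2);
INSERT INTO #t VALUES (3, 3);

DECLARE @field varchar(50);
SET @field = 'ID1';

DECLARE @query varchar(MAX);
SET @query = 'SELECT * FROM #t WHERE ' + @field + ' = 1';
EXEC (@query);

DROP TABLE #t;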
Edit: OK, I can't read; thanks to Col. Shrapnel for the help. If anyone comes here looking for the same thing to be answered:
print_r(preg_split('/([\!|\?|\.|\!\?])/', $string, null, PREG_SPLIT_DELIM_CAPTURE));
Is there any way to split a string on a set of delimiters, and retain the position and character(s) of the delimiter after the split?
For example, using the delimiters ! ? . !?, turning this:
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';
into this
array('Hello', '.', 'A question', '?', 'How strange', '!', 'Maybe even surreal', '!?', 'Who knows', '.');
Currently I'm trying to use print_r(preg_split('/([\!|\?|\.|\!\?])/', $string)); to capture the delimiters as a subpattern, but I'm not having much luck.
I have a complete string like this:
N:Pay in Cash++RGI:40++R:200++T:Purchase++IP:N++IS:N++PD:PC++UCP:598.80++UPP:0.00++TCP:598.80++TPP:0.00++QE:1++QS:1++CPC:USD++PPC:Points++D:Y++E:Y++IFE:Y++AD:Y++IR:++MV:++CP:~ ~N:ERedemption++RGI:42++R:200++T:Purchase++IP:N++IS:N++PD:PC++UCP:598.80++UPP:0.00++TCP:598.80++TPP:0.00++QE:1++QS:1++CPC:USD++PPC:Points++D:Y++E:Y++IFE:Y++AD:Y++IR:++MV:++CP:
The structure of the string is as follows:
It's a list of POs (Payment Options), separated by ~~.
The list may contain one or more POs.
Each PO contains only key-value pairs, separated by :.
Spaces are denoted by ++.
I need to extract the values for the keys "RGI" and "N".
I can do it via a for loop, but I want a more efficient way to do this.
Any help on this?
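A minimal sketch of the kind of parsing I mean (Python just for illustration; it assumes the separators are exactly as described, and the sample is abbreviated):

raw = ("N:Pay in Cash++RGI:40++R:200++T:Purchase++CP:"
       "~ ~"
       "N:ERedemption++RGI:42++R:200++T:Purchase++CP:")

# The sample shows "~ ~" between POs, so split on "~" and skip empty chunks.
for po in raw.split("~"):
    po = po.strip()
    if not po:
        continue
    pairs = dict(
        field.split(":", 1)
        for field in po.split("++")
        if ":" in field
    )
    print(pairs.get("N"), pairs.get("RGI"))   # -> "Pay in Cash 40", "ERedemption 42"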
I took the following code from the examples page for Asio:
class tcp_connection : public boost::enable_shared_from_this<tcp_connection>
{
public:
typedef boost::shared_ptr<tcp_connection> pointer;
static pointer create(boost::asio::io_service& io_service)
{
return pointer(new tcp_connection(io_service));
}
tcp::socket& socket()
{
return socket_;
}
void start()
{
message_ = make_daytime_string();
boost::asio::async_write(socket_, boost::asio::buffer(message_),
boost::bind(&tcp_connection::handle_write, shared_from_this(),
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred));
}
private:
tcp_connection(boost::asio::io_service& io_service)
: socket_(io_service)
{
}
void handle_write(const boost::system::error_code& /*error*/,
size_t /*bytes_transferred*/)
{
}
tcp::socket socket_;
std::string message_;
};
I'm relatively new to C++ (coming from a C# background), and from what I understand, most people would split this into a header and a source file (declaration and implementation, respectively). Is there any reason I can't just leave it all in the header file if I'm going to use it across many source files? If so, are there any tools that will automatically convert it to declaration/implementation for me? Can someone show me what this would look like split into a header/source file pair (or just part of it, anyway)? I get confused around weird stuff like this: typedef boost::shared_ptr<tcp_connection> pointer; Do I include this in the header or the source? Same with tcp::socket& socket().
I've read many tutorials, but this has always been something that has confused me about C++.
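This is as far as I've got attempting the split myself; I'm not sure it's right (make_daytime_string is assumed to be defined elsewhere, as in the tutorial):

// tcp_connection.hpp -- declarations only
#ifndef TCP_CONNECTION_HPP
#define TCP_CONNECTION_HPP

#include <string>
#include <boost/asio.hpp>
#include <boost/enable_shared_from_this.hpp>
#include <boost/shared_ptr.hpp>

using boost::asio::ip::tcp;

class tcp_connection : public boost::enable_shared_from_this<tcp_connection>
{
public:
    typedef boost::shared_ptr<tcp_connection> pointer;   // typedefs live in the header

    static pointer create(boost::asio::io_service& io_service);
    tcp::socket& socket();
    void start();

private:
    explicit tcp_connection(boost::asio::io_service& io_service);
    void handle_write(const boost::system::error_code& error,
                      std::size_t bytes_transferred);

    tcp::socket socket_;
    std::string message_;
};

#endif

// tcp_connection.cpp -- definitions
#include "tcp_connection.hpp"
#include <boost/bind.hpp>

std::string make_daytime_string();   // defined elsewhere in the tutorial

tcp_connection::pointer tcp_connection::create(boost::asio::io_service& io_service)
{
    return pointer(new tcp_connection(io_service));
}

tcp::socket& tcp_connection::socket()
{
    return socket_;
}

void tcp_connection::start()
{
    message_ = make_daytime_string();
    boost::asio::async_write(socket_, boost::asio::buffer(message_),
        boost::bind(&tcp_connection::handle_write, shared_from_this(),
                    boost::asio::placeholders::error,
                    boost::asio::placeholders::bytes_transferred));
}

tcp_connection::tcp_connection(boost::asio::io_service& io_service)
    : socket_(io_service)
{
}

void tcp_connection::handle_write(const boost::system::error_code& /*error*/,
                                  std::size_t /*bytes_transferred*/)
{
}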
In one SQL Task, can I create a table variable:
DECLARE @TableVar TABLE (...)
and then, in another SQL Task or DataSource destination, select from or insert into the table variable?
The other option I have considered is using a Temp Table.
CREATE TABLE #TempTable (...)
I would prefer to use a table variable so that it remains in memory, but I can use a temp table if the table variable is not possible. Also, I cannot use the recordset destination, as I need to perform straight SQL tasks on the data later on.
My specific concern is related to the performance of a clustered index on a reference table that has many rapid inserts and deletes.
Table 1 "Collection" collection_pk int (among other fields)
Table 2 "Item" item_pk int (among other fields)
Reference Table "Collection_Items" collection_pk int, item_pk int (combined primary key)
Because the primary key is composed of both PKs, a clustered index is created and the data is physically ordered in the table according to the combined keys.
I have many users creating and deleting collections, and adding and removing items from those collections, very frequently, which affects the Collection_Items table and its clustered index.
QUESTION PART: Since the Collection_Items table is so dynamic, wouldn't there be a big performance hit from constantly re-sorting the table rows because of the clustered index?
If yes, what should I do to minimize this?
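For reference, the table looks roughly like this; I've also wondered whether a lower fill factor on the clustered index would soften the churn (just a guess on my part):

CREATE TABLE Collection_Items (
    collection_pk int NOT NULL,
    item_pk       int NOT NULL,
    CONSTRAINT PK_Collection_Items
        PRIMARY KEY CLUSTERED (collection_pk, item_pk)
        WITH (FILLFACTOR = 80)   -- leave page space for frequent inserts
);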
On a webpage, is it possible to split large files into chunks before the file is uploaded to the server? For example, split a 10MB file into 1MB chunks, and upload one chunk at a time while showing a progress bar?
It sounds like JavaScript doesn't have any file manipulation abilities, but what about Flash and Java applets?
This would need to work in IE6+, Firefox, and Chrome. Update: I forgot to mention that (a) we are using Grails and (b) this needs to run over HTTPS.