
Arctic Region Supercomputing Center

Multisearch Log

These are the log files from 9-13 June 2008. For additional log files, please return to the main log page.

Friday, 13 June
The results for vsmstat did not finish until 10:30am, which is later than I had calculated. I've been reading about using Hadoop, since the plan is to integrate Lucene and Lemur through Hadoop.

After the vsmstat run completed, I wanted to play with the dynamic runs a bit to see what kinds of errors they generated. Since Snowy has most of the backends, it is hitting Java's "GC overhead limit exceeded" error. I've realized there are a few options that can be pursued:

  1. Increase Snowy's heap (i.e., raise Tomcat's JVM -Xmx setting).
  2. Don't validate the backends during configuration.
  3. Don't load the backends before the top 50 are selected for the dynamic runs.

Before, with the Restriction algorithms, all the backends would be loaded and validated, then cut down by an algorithm. Here, since we've already done the algorithm work, we can simply not load the backends at all until they're selected. That should avoid the memory error, though it will take slightly more time.
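To make option 3 concrete, here is a minimal, self-contained sketch of the idea. The Descriptor fields and the loadBackend() stand-in are hypothetical, not Multisearch's real API; the point is just that ranking happens on lightweight metadata and only the surviving 50 backends ever get loaded:

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    public class LazySelection {
        // A descriptor holds just the metadata needed for ranking,
        // not the loaded backend itself.
        static class Descriptor {
            final String name;
            final double score; // e.g., a precomputed static score
            Descriptor(String name, double score) {
                this.name = name;
                this.score = score;
            }
        }

        // Stand-in for the expensive load/validate step.
        static Object loadBackend(Descriptor d) {
            System.out.println("loading " + d.name);
            return new Object();
        }

        public static void main(String[] args) {
            List<Descriptor> all = new ArrayList<>();
            for (int i = 0; i < 990; i++) {
                all.add(new Descriptor("backend" + i, Math.random()));
            }
            // Rank by score and keep the top 50; the other 940
            // descriptors never turn into loaded backends, so they
            // never touch the heap.
            all.sort(Comparator.comparingDouble((Descriptor d) -> d.score).reversed());
            for (Descriptor d : all.subList(0, 50)) {
                loadBackend(d);
            }
        }
    }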

We're hoping to get all 5 runs done, which seems entirely possible provided the dynamic runs work correctly. I am going to cap the results at the top 1,000 documents, since Chris now has to trim the results for the runs (too much text otherwise).

The TREC submissions are due by Monday, but Chris says they take a long time to upload since they're large files. He's leaving on Sunday, so if the last run or two don't make it in before then, I'll have to submit them myself.

Thursday, 12 June
I did some optimization on the code (e.g., not closing and reopening the output file for each query); each search still takes 3-5 seconds, but it prints faster now. I started the first full static run (lsi150stat) at 11:40am, and I expect it to be done by 9:40-10:40pm at the latest.
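For the record, the file-handling change amounts to something like this (runQuery() and the file name are hypothetical stand-ins):

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.io.IOException;

    public class ResultWriter {
        public static void main(String[] args) throws IOException {
            String[] topics = {"q1", "q2", "q3"};
            // One writer for the whole run, instead of a close/reopen
            // cycle on every query.
            try (BufferedWriter out = new BufferedWriter(new FileWriter("results.txt"))) {
                for (String topic : topics) {
                    out.write(runQuery(topic));
                    out.newLine();
                }
            }
        }

        static String runQuery(String topic) {
            return topic + " <results would go here>";
        }
    }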

As the runs go by, I can see that more and more of the searches take only 2 or so seconds, but some are in the realm of 7-12 seconds, so the original prediction of 3-5 seconds holds.

By 10:45pm, the results for lsi150stat were done (I am not sure of the exact stop time)! I started the next round of results around 11:00pm.

Wednesday, 11 June
After a lot of issues, all of the backends are up and running, around 990 in total! The configuration file for the full multisearch with all of the backends doesn't load very well, and Pileus and Snowy will hang if I try to run a query across all 990 backends.

I looked into the problem (re: why won't Tomcat shut down most of the time when I tell it to?), and it seems that servlets and threading don't go well together. The servlet itself, technically, does not spawn any threads, but Multisearch's run() method uses threads to run the searches (since a slow server might take twice as long to respond, and we don't want to wait for that). Maybe there is a safe way to destroy() the search threads to stop the hanging? Right now I have to kill the Tomcat processes if I want to restart.
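One possible shape for a fix, assuming the search threads move onto an ExecutorService owned by the servlet (this is a sketch of the general technique, not the current Multisearch code):

    import java.io.IOException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class MultisearchServlet extends HttpServlet {
        // One pool for all backend searches, owned by the servlet.
        private final ExecutorService searchers = Executors.newFixedThreadPool(50);

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            // Each backend query runs as its own task, so a slow
            // server only ties up one worker.
            searchers.submit(() -> queryOneBackend(req.getParameter("q")));
        }

        private void queryOneBackend(String query) {
            // real backend search would go here
        }

        @Override
        public void destroy() {
            // Called by Tomcat when the servlet is taken out of service:
            // interrupt any in-flight search threads so the container
            // can actually exit.
            searchers.shutdownNow();
        }
    }

The key line is shutdownNow() in destroy(): interrupting the workers there means no stray search threads are left to keep the process hanging at shutdown.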

As of 6pm, all of the code works for running the trial TREC searches. Dulcet went down again and I had to go home, and PuTTY wouldn't stay connected long enough for me to run the search of 10,000 queries.

Right now, each query takes between 3 and 5 seconds to run, so we're looking at a 10-11 hour run for all 10,000 static topics (10,000 queries at roughly 4 seconds each is about 11 hours). I'm going to start the run in the morning.

Tuesday, 10 June
After some setbacks with the backends, I started working on completing some code to run the searches for TREC2008 on Multisearch. I have been testing it, but I want to make sure it scales well to 10,000 queries.

I've sort of “cheated” a little for the static runs. Instead of opening the file with all the static (non-changing-by-query) scores for the servers, I simply generated a configMultisearch.xml file for the top 50 servers listed. I wanted to start the static runs while working on the new dynamic run code, and this helped me out.
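The generation step looks roughly like this; the score-file format ("server<TAB>score" per line) and the XML element names are assumptions for illustration, not the project's real ones:

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    public class TopFiftyConfig {
        public static void main(String[] args) throws IOException {
            // Each line: "serverName<TAB>staticScore"
            List<String> lines =
                    new ArrayList<>(Files.readAllLines(Paths.get("static_scores.txt")));
            // Highest static score first.
            lines.sort(Comparator.comparingDouble(
                    (String line) -> Double.parseDouble(line.split("\t")[1])).reversed());
            try (PrintWriter xml = new PrintWriter("configMultisearch.xml")) {
                xml.println("<multisearch>");
                for (String line : lines.subList(0, Math.min(50, lines.size()))) {
                    xml.println("  <backend name=\"" + line.split("\t")[0] + "\"/>");
                }
                xml.println("</multisearch>");
            }
        }
    }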

I'm currently testing this against the trial topics (there are 100 of them) to ensure that the file printing is going well. I'm still having some backend issues, but I've found a good solution: by running one search over all the listed backends (the 50 servers in the generated config), I can see which ones are having issues and which ones aren't. Some of them needed resources exposed, and others had typos in how they were launched. I fixed them up, and hopefully that will be the last time I work with the backends (once I finish running results).
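The check itself is just a loop over the generated list with one cheap query per backend; searchBackend() here is a hypothetical stand-in for the real Multisearch call:

    import java.util.Arrays;
    import java.util.List;

    public class BackendCheck {
        public static void main(String[] args) {
            List<String> backends = Arrays.asList("backend1", "backend2", "backend3");
            for (String b : backends) {
                try {
                    searchBackend(b, "test query");
                    System.out.println("OK    " + b);
                } catch (Exception e) {
                    // Unexposed resources and launch typos show up here.
                    System.out.println("FAIL  " + b + ": " + e.getMessage());
                }
            }
        }

        // Stand-in for a real single-backend search.
        static void searchBackend(String backend, String query) throws Exception {
        }
    }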

Monday, 9 June
Over the weekend, I worked on getting all the backends up and running on Multisearch. OGSA-DAI can load all the backends on the servers, but it takes a little longer (8+ seconds by the end of the runs) to add a new service or resource.

I only have one service on Nimbus, because it's almost full already. Pileus has a handful, but the bulk of the servers are on Snowy, which is a new server with tons of space. Multisearch with OGSA-DAI is a very good structure for those who know it, since it scaled well to loading over 1,000 backends (there were a few mistakes!) and to working with large XML files.

I'm currently troubleshooting some runs for TREC2008, hoping to get that squared away for tomorrow. The backends are still giving me trouble with the servlet, so I am looking into why now...
