Multisearch Log

These are the logs from the week of 11-15 August. To see more, view the main log page.

Friday, 15 August
I am running the larger data sets with queries now so that I can run the shorter ones remotely. I'm also finishing up this week's to-do list:

Write up website entries
- ~~Documentation - including Javadocs~~
- ~~Multisearch~~
- ~~Installation~~
- ~~Backends~~
- ~~Limitations~~
- ~~Downloads~~
- ~~Papers~~
- ~~Site Index~~
Update TREC paper
Write MS 2008 Paper

Thursday, 14 August
Presentation given on Multisearch [PPT] [PDF]

Wednesday, 13 August
I'm trying to download files from Midnight. Chris generated a few files with information about runs, which I would like to start using on Multisearch to gauge its speed. I am doing the following runs (hopefully before I leave):

Merge Algorithm	Restriction Algorithm	Limit (Backends)
Naive	Random	10
Naive	Random	50
Naive	Random	100
Naive	None	n/a
Naive	Matrix	10
Naive	Matrix	50
Naive	Matrix	100
RankShuffle	Random	10
RankShuffle	Random	50
RankShuffle	Random	100
RankShuffle	None	n/a
RankShuffle	Matrix	10
RankShuffle	Matrix	50
RankShuffle	Matrix	100
LeapOfFaith	Random	10
LeapOfFaith	Random	50
LeapOfFaith	Random	100
LeapOfFaith	None	n/a
LeapOfFaith	Matrix	10
LeapOfFaith	Matrix	50
LeapOfFaith	Matrix	100
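
The run list above is every merge algorithm paired with every restriction algorithm, with three backend limits for Random and Matrix and a single unlimited run for None. A minimal sketch of generating that list, assuming the runs are driven by plain strings (the class name `RunMatrix` and the tab-separated output are hypothetical, not part of Multisearch):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: enumerates the planned runs in the same order as the table.
final class RunMatrix {
    static final String[] MERGES = {"Naive", "RankShuffle", "LeapOfFaith"};
    static final String[] RESTRICTIONS = {"Random", "None", "Matrix"};
    static final int[] LIMITS = {10, 50, 100};

    static List<String> enumerate() {
        List<String> runs = new ArrayList<>();
        for (String merge : MERGES) {
            for (String restriction : RESTRICTIONS) {
                if (restriction.equals("None")) {
                    // No restriction means all backends, so there is no limit to vary.
                    runs.add(merge + "\t" + restriction + "\tn/a");
                    continue;
                }
                for (int limit : LIMITS) {
                    runs.add(merge + "\t" + restriction + "\t" + limit);
                }
            }
        }
        return runs;
    }
}
```

That gives 3 x (3 + 1 + 3) = 21 runs, three of which (the None rows) go over all backends.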

~~Restriction algorithms~~
Integrate C++ to Java for MatrixSelect
Write up website entries
- Documentation - including Javadocs
- Multisearch
- Installation
- Backends
- Limitations
- Downloads
- Papers
- Site Index
Update TREC paper
Write MS 2008 Paper
MS 2008 Presentation
~~Perhaps add a few new Lemur backends if some gov2 are missing~~

All of the backends are defined in allinput/, which is formally located at /home/mccormic/merge/tomcat/webapps/multisearch/WEB-INF/classes/allinput/

I've pruned this directory to remove redundant and/or faulty backends. Now there are about 1000 proper ones. Yay! I've also made note of the three longest runs -- the ones that will run over all backends. I might run these last (since they take so long) or first... I haven't decided yet.

Tuesday, 12 August
I've tried to run Hadoop/Multisearch with all the backends on Snowy, and the map() pans out fine, but not the merge. I want to try to change the addRanked() function of OrderedList to use binary search! Maybe that will speed things up!

Update: It definitely seems to. I might also try appending each item quickly (to the end) and sorting once at the finish, instead of finding an insertion point every time. Now, back to the list...
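
A minimal sketch of the two approaches, for comparison. The real `OrderedList` and its `addRanked()` signature are not shown in this log, so the class `RankedListSketch`, the `Hit` type, and the descending-by-score ordering are all assumptions made for illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for OrderedList; not the actual Multisearch class.
final class RankedListSketch {
    // Hypothetical scored result.
    static final class Hit {
        final String docId;
        final double score;
        Hit(String docId, double score) {
            this.docId = docId;
            this.score = score;
        }
    }

    // Assumed ordering: highest score first.
    private static final Comparator<Hit> BY_SCORE_DESC =
            (a, b) -> Double.compare(b.score, a.score);

    private final List<Hit> hits = new ArrayList<>();

    // Option 1: binary search for the insertion point, then insert there.
    void addRanked(Hit hit) {
        int pos = Collections.binarySearch(hits, hit, BY_SCORE_DESC);
        if (pos < 0) {
            pos = -(pos + 1); // a negative result encodes the insertion point
        }
        hits.add(pos, hit);
    }

    // Option 2: append now, sort once when all results are in.
    void addUnsorted(Hit hit) {
        hits.add(hit);
    }

    void sortAtFinish() {
        hits.sort(BY_SCORE_DESC);
    }

    List<Hit> view() {
        return Collections.unmodifiableList(hits);
    }
}
```

Binary search cuts the comparisons per insert to O(log n), but inserting into the middle of an `ArrayList` still shifts elements, so each insert stays O(n) in the worst case. Appending and sorting once is O(n log n) overall, which is why it may be worth trying for the all-backends runs.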

~~Full query parsing (simply remove all special characters & lowercase)~~
~~Add timing mechanism~~
Restriction algorithms
Fix Lemur indexing issue with .key file
Write up website entries
- Documentation - including Javadocs
- Multisearch
- Installation
- Backends
- Limitations
- Downloads
- Papers
- Site Index
~~Load all backends from Snowy/Pileus~~
Perhaps add a few new Lemur backends if some gov2 are missing
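
The query-parsing item in the list above (remove all special characters and lowercase) can be sketched as follows. This is only an illustration: the class name `QueryNormalizer` is hypothetical, and replacing each special character with a space (rather than deleting it outright) is an assumption made so that terms such as "oil-price" do not get glued together.

```java
import java.util.Locale;

// Hypothetical helper, not the actual Multisearch query parser.
final class QueryNormalizer {
    static String normalize(String raw) {
        // Assumption: anything that is not a letter, digit, or whitespace is "special".
        String cleaned = raw.replaceAll("[^\\p{L}\\p{N}\\s]", " ");
        // Lowercase, trim the ends, and collapse runs of whitespace to one space.
        return cleaned.toLowerCase(Locale.ROOT).trim().replaceAll("\\s+", " ");
    }
}
```

For example, `QueryNormalizer.normalize("Hello, World!")` returns `hello world`.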

The server-selection sections are working, except the cleanup isn't. I need to delete the generated files so the next search can generate its own files. I know that a directory needs to be emptied before it can be deleted, so I am trying some stuff...
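
One way to handle the empty-before-delete requirement is to delete a directory's contents recursively before the directory itself. A minimal sketch using only `java.io.File`; the class name `CleanupSketch` is hypothetical, and this is not necessarily how the Multisearch cleanup ends up working:

```java
import java.io.File;

// Hypothetical helper for removing a directory of generated files.
final class CleanupSketch {
    // Returns true only if the target and everything under it was deleted.
    static boolean deleteRecursively(File target) {
        File[] children = target.listFiles(); // null when target is not a directory
        if (children != null) {
            for (File child : children) {
                if (!deleteRecursively(child)) {
                    return false; // stop at the first entry that cannot be removed
                }
            }
        }
        // By now a directory is empty, so delete() can succeed on it.
        return target.delete();
    }
}
```

`File.delete()` reports failure by returning false rather than throwing, so the return value is worth checking; a file still held open by another part of the search would be one possible cause.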

Monday, 11 August
The home stretch!

Full query parsing (simply remove all special characters & lowercase)
Add timing mechanism
Restriction algorithms
Fix Lemur indexing issue with .key file
Write up website entries
- ~~Introduction~~
- Documentation - including Javadocs
- Multisearch
- Installation
- Backends
- ~~Components~~
- ~~Architecture~~
- Limitations
- Downloads
- Papers
- Site Index
~~Send off TREC paper writing to Chris~~
Load all backends from Snowy/Pileus
Perhaps add a few new Lemur backends if some gov2 are missing

Arctic Region Supercomputing Center
PO Box 756020, Fairbanks, AK 99775

© Arctic Region Supercomputing Center 2006-2008. This page was last updated on 16 August 2008.
These files are part of a portfolio for Kylie McCormick's online resume. See the disclaimer for more information.