links for 2009-09-15
-
Not long ago, I was invited to participate in a customer’s annual conference. It was an amazing experience. I’ve been to conferences of all sorts, but I confess I’ve never attended an event quite like this one. Let’s just say that I’m used to... well... less energetic IT conferences. This particular company is *extremely* good at marketing and really understands the power of hype. The combination of pounding dance music, an elaborate stage setup, spectacular lighting, and, most importantly, well-crafted and super-hyped product announcements had the 20,000+ attendees in a frenzy.
-
A major trend we've written about before, and that we see continuing over the next couple of years, is the significant reduction in price for what are now best-of-breed technologies in the space. This is being driven by a couple of factors, including increasing functionality in the open source alternatives Lucene and Solr, and the acquisition of FAST by Microsoft, with the anticipated integration of FAST ESP into SharePoint, which many feel will result in a much lower price point.
Lately, we've seen a few major vendors engaging in some pretty severe obfuscation in their licensing parameters. I'm not sure whether it's a remnant of the 'good old days' or a last-ditch attempt to extract as much revenue as possible before the inevitable collapse in licensing costs we've talked about before. Let me explain by way of analogy.
-
Solr Cell, a new feature in the soon-to-be-released Solr 1.4, allows users to send rich documents such as MS Word and Adobe PDF directly into Solr and have them indexed for search. All of the examples on the Solr Cell wiki page, however, only demonstrate how to send in the documents using the curl command-line utility, while many Solr users rely on SolrJ, Solr’s Java-based client. Thus, I thought I would throw up a quick example here (and I’ll update the Wiki) demonstrating how to do this.
-
A popular feature of most modern search applications is auto-suggest or auto-complete: as a user types their query into a text box, suggestions of popular queries are presented. As each additional character is typed in by the user, the list of suggestions is refined. There are several different approaches in Solr to provide this functionality, but we will be looking at an approach that involves using EdgeNGrams as part of the analysis chain. Two other approaches are to use either the TermsComponent (new in Solr 1.4) or faceting.
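To make the EdgeNGrams idea concrete, here is a minimal sketch in plain Python. It is not Solr's implementation and is not taken from the linked article; the function names (`edge_ngrams`, `build_index`, `suggest`) and the sample queries are hypothetical. It assumes the usual setup in which leading-edge n-grams are generated at index time and the typed prefix is looked up as-is at query time. The `min_gram` and `max_gram` arguments loosely mirror the `minGramSize` and `maxGramSize` settings of Solr's edge n-gram filter.

```python
def edge_ngrams(term, min_gram=1, max_gram=25):
    """Return the leading-edge n-grams of a term, shortest first."""
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]


def build_index(queries, min_gram=1, max_gram=25):
    """Map every edge n-gram to the stored queries it is a prefix of.

    This stands in for index-time analysis: each query is lowercased
    and expanded into its prefixes before being stored.
    """
    index = {}
    for query in queries:
        for gram in edge_ngrams(query.lower(), min_gram, max_gram):
            index.setdefault(gram, []).append(query)
    return index


def suggest(index, typed):
    """Look up what the user has typed so far.

    The typed text is only lowercased, not n-grammed, so it matches
    exactly one stored prefix.
    """
    return index.get(typed.lower(), [])


if __name__ == "__main__":
    # Hypothetical popular queries, for illustration only.
    index = build_index(["solr", "solar panels", "lucene"])
    for typed in ("s", "sol", "sola", "lu"):
        print(typed, "->", suggest(index, typed))
```

Each extra character the user types simply selects a longer, more specific prefix key, which is why the suggestion list narrows as typing continues. The cost is a larger index, since every stored query contributes one entry per prefix length.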
links for 2009-09-09
-
Below we have compiled - in no particular order - 50 things that are in the process of being killed off by the web, from products and business models to life experiences and habits. We've also thrown in a few things that have suffered at the hands of other modern networking gadgets, specifically mobile phones and GPS systems.
-
Conference presentations about search and retrieval, text mining, and content processing are often little more than sales pitches. In the last 30 years, I've met a number of people who have made significant contributions to information retrieval. What I want to do is periodically interview some of the more interesting "search wizards". Most of these people do not think of themselves as "wizards". Most work diligently to fill in the gaps in their knowledge. Search is a complicated discipline, and it is a full-time job to keep up with developments.
links for 2009-09-08
-
The third release candidate for Lucene 2.9 is about to hit and the final release is likely to be only days behind. Almost one year in the making, Lucene 2.9 is feature packed and progressively faster. With Solr 1.4 planning to release very shortly after 2.9, things are shaping up very nicely in Lucene land. Congrats to all the devs involved in both releases – I really think this is the culmination of some really fantastic work from both projects.
links for 2009-09-04
-
The patent application contains a single illustration of the familiar Google.com user interface which, as we know, is quite spartan. In other words, Google essentially owns the concept of putting a big search box on top of two buttons and putting some text links nearby.