Mar 26

Web data, especially social media data, is growing exponentially, and the opportunities for using real-time web data to improve analysis and decision making are limited only by your imagination.  These days, companies must incorporate web data into their intelligence and analysis tools in order to compete. In some industries it’s a matter of survival.

Real-time data is where the answers are.  It’s where market and customer trends are immediately identifiable.  It’s where deals will be won and where winners will claim their trophies.

Reality Buzz

We recently built something 30M Americans can relate to – a way to predict American Idol and other reality show results based on data harvested from popular social media sites.

Every week on American Idol, contestants perform, fans dial in to vote for their favorites, and the next day one or more contestants are voted off the show.  During the performances, and for several hours after, fans tweet about and discuss their favorites online, showing support for the ones they want to see voted through to the next show.

We scrape thousands of pieces of web data from Twitter, Facebook, forums and discussion sites around the web, apply sentiment analysis, analyze the results, and predict which contestant(s) will be eliminated from the show, all in the span of a few hours.
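As a rough illustration of the pipeline described above (not the actual Reality Buzz method, which is not described in detail here), the Python sketch below scores a handful of made-up posts with a tiny hand-written sentiment lexicon and flags the contestant with the lowest net score as the predicted elimination. The contestant names, posts, word lists, and function names are all hypothetical.

```python
import re

# Hypothetical word lists; a real system would use a proper sentiment model.
POSITIVE = {"love", "amazing", "best", "great", "win"}
NEGATIVE = {"awful", "boring", "worst", "flat", "bad"}

def score_post(text):
    """Return (number of positive words) - (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def predict_elimination(posts, contestants):
    """Add each post's score to every contestant it mentions.

    The contestant with the lowest total is the predicted elimination.
    """
    totals = {name: 0 for name in contestants}
    for text in posts:
        score = score_post(text)
        lowered = text.lower()
        for name in contestants:
            if name.lower() in lowered:
                totals[name] += score
    return min(contestants, key=lambda n: totals[n]), totals

# Made-up sample data for illustration only.
contestants = ["Alex", "Blake", "Casey"]
posts = [
    "Alex was amazing tonight, best of the night",
    "I love Casey, hope they win",
    "Blake sounded flat and boring",
    "Blake was the worst, Alex was great",
]

predicted, totals = predict_elimination(posts, contestants)
print(predicted, totals)
```

Scoring whole posts is crude: the last sample post praises one contestant and criticizes another, so its score nets out to zero for both. A production system would need to attribute sentiment to the right name.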

We built the robots (automated web data collection processes) in a matter of hours.  Now they are automated to collect the data, transform unstructured data into structured data, and load it into a MySQL database.
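The collect, transform, and load steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the raw line format is invented, and the standard-library `sqlite3` module stands in for MySQL so the example runs without a database server. Kapow robots do this work through their own tooling, not through code like this.

```python
import re
import sqlite3

# Hypothetical raw format: "<source> | <timestamp> | <free text>"
RAW = [
    "twitter | 2010-03-23 21:14 | Alex was amazing tonight",
    "forum | 2010-03-23 21:40 | Blake sounded flat",
    "this line is malformed and is skipped",
]

LINE = re.compile(r"^(\w+) \| (\d{4}-\d{2}-\d{2} \d{2}:\d{2}) \| (.+)$")

def transform(lines):
    """Yield (source, posted_at, body) tuples for lines that match the pattern."""
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            yield match.groups()

def load(conn, rows):
    """Create the table if needed and insert the structured rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts (source TEXT, posted_at TEXT, body TEXT)"
    )
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", rows)
    conn.commit()

# sqlite3 is a stand-in for MySQL here (assumption for a self-contained demo).
conn = sqlite3.connect(":memory:")
load(conn, transform(RAW))
rows = conn.execute(
    "SELECT source, posted_at, body FROM posts ORDER BY posted_at"
).fetchall()
print(rows)
```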

In the last two weeks, starting with the top 12 contestants, we’ve successfully predicted the American Idol contestant to be eliminated hours before the elimination show aired.  To see our latest predictions and learn more, please visit Reality Buzz on Facebook.

Imagine what you could do for your business with Kapow’s Web Data Server and a few hours creating Kapow robots.  Real-time web data can fuel predictive analytics capabilities to give your company an unfair advantage.

Over 400 Kapow customers are jumping in with both feet.  What’s stopping you?  Learn more on our Kapowtech.com website, or contact us for a Free Trial of the Kapow Web Data Server.

By:  Rick Kawamura

Nov 23

As a frequent traveler I often stay at Marriott hotels and noticed they recently upgraded and redesigned their website.

Why is this interesting in a blog about Web Data Services?

Marriott has just done what so many other companies are doing these days – they are modernizing their website with a user-friendly, dynamic AJAX-based interface to enhance user experience.

While AJAX is helpful for creating interactive web applications like a hotel reservation system, it’s very bad news for business users who depend on collecting web data with homegrown scripts and primitive web scrapers to feed business applications like Market Intelligence, Financial Research, and Buzz Analytics.

But don’t despair. Kapow Technologies just released version 7.1 of Kapow Web Data Server, which includes support for even the most sophisticated AJAX toolkits, including Google Web Toolkit.

That said, I couldn’t write this blog post without testing it out myself.

So I powered up RoboMaker 7.1, typed in www.marriott.com, and did a simple search for hotels near San Francisco airport.

A dynamic map (powered by Microsoft Bing) then appeared, showing the locations of the 10 nearest Marriott hotels.  I wanted to create a loop over the ten hotels on the map, so I simply clicked on hotel number 1, clicked the insert loop command, and in a few minutes I had created a Kapow robot that could extract hotels directly from a highly dynamic, AJAX-based map.

Tell me about any other product on the planet that can do this in 2 minutes!

Check out the picture below, and be sure to think about Kapow Web Data Server when your current Web Data Extraction tools break in the world of modern AJAX powered web sites.

Marriott Map

By:  Stefan Andreasen, Founder and CTO

Jul 30

Often without realizing it, more and more companies rely on Web Data (any data you can see in a web browser) as a critical foundation for making business decisions.

Ron’s post on Web Data reminded me of this interesting blog post, “More data usually beats better algorithms”, written by Anand Rajaraman, co-founder of Kosmix and also Consulting Assistant Professor of Data Mining at Stanford University.

The blog post describes how Anand’s students competed for the $1 Million Netflix Prize, a competition open to the public.

Netflix provides a huge data set of customer movie ratings from the past, and the challenge is to use this data to create a better algorithm than Netflix already has to predict which movies people want to view in the future.

Anand’s students attacked this challenge, and in his post he highlights two very different approaches.  Team A focused on developing a sophisticated algorithm.  Team B used a simple algorithm and focused more on the data, pulling in additional movie data from IMDb (Internet Movie Database).

Which team performed better?

Team B, who focused more on the data, got to the top of the Netflix Prize leaderboard.

Anand’s point?  “…adding more, independent data usually beats out designing ever-better algorithms to analyze an existing data set. I’m often surprised that many people in business, and even in academia, don’t realize this.”  Just adding one extra set of data can improve the quality of your decision making several times over.

The key is not choosing between a better algorithm and better data, but improving the outcome of your decision-making by adding more data, namely Web Data. Think about the impact on your business if you could add high-value Web Data to your Market Intelligence, Pricing Intelligence, Financial Intelligence or any other Business Intelligence product.

Many companies already have knowledge workers who cut-and-paste Web Data into their BI tools or use simple Web Scraping tools like Velocityscape, Connotate, QL2 or Mozenda (which are limited by their inability to handle dynamic web content like AJAX or JavaScript).  To get the most out of your Business Intelligence projects, you’ll want a full Web Data Services product like the Kapow Web Data Server.

Unleash the real power of Web Data to make better business decisions.

Check it out and let me hear your comments.

By:  Stefan Andreasen

Jul 13

Scraping comes from “Screen Scraping”, a term used for a set of products that turn old “Green Screen” mainframe applications into web services by “wrapping” the screen protocol.  Screen Scrapers connect to the fields of a 32×80 character terminal and read pixels, text and numbers to fill in forms, in turn wrapping the application into a programmatic interface or web service.  Examples of such products are IBM Rational HATS and Attachmate EXTRA.

Web Scraping is conceptually identical to Screen Scraping as it “wraps” a human interface into a programmatic interface, but instead of “wrapping” a character based mainframe protocol, it “wraps” a Web site or Web application and turns it into an API.

It sounds similar but technically, and in use cases, it’s quite different.

Web Scraping does not represent all approaches to wrapping Web applications into APIs – it’s limited to traditional methods that use scripting languages like Perl or Python to extract data from static HTML with regular expressions. This method of extracting data from web sites has been used for years, but it has been running into two growing challenges:  it’s fragile toward changes in the underlying web application, and more importantly, it simply does not work with today’s dynamic AJAX powered web sites.

If you are a Perl programmer I encourage you to build a simple “web scraper”. Go to Gmail.com and create a Perl script that can log in and read the content of your inbox. You will quickly find out that it is nearly impossible.
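Both challenges are easy to reproduce. The sketch below is a hypothetical regex-based scraper in Python; the HTML snippets, the `price` class name, and the `/app.js` path are invented for illustration. It works on the original markup, then silently returns nothing after a cosmetic redesign, and again when the page is an AJAX shell whose content is filled in later by JavaScript.

```python
import re

# Hypothetical scraper: pull a price out of static HTML with a regular expression.
PRICE = re.compile(r'<span class="price">\$([\d.]+)</span>')

def extract_price(html):
    """Return the price as a float, or None if the pattern does not match."""
    match = PRICE.search(html)
    return float(match.group(1)) if match else None

# The markup the scraper was written against.
original = '<div><span class="price">$129.00</span></div>'
# A redesign adds one CSS class; the data is unchanged but the regex no longer matches.
redesigned = '<div><span class="price new-style">$129.00</span></div>'
# An AJAX page ships an empty container; the price only exists after JavaScript runs.
ajax_shell = '<div id="results"></div><script src="/app.js"></script>'

print(extract_price(original))    # 129.0
print(extract_price(redesigned))  # None
print(extract_price(ajax_shell))  # None
```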

Let me introduce the Kapow Web Data Server – it takes over where fragile “Web Scraping” scripts fail, delivering a point-and-click interface to turn a website like gmail.com into a sharable REST or SOAP service in the cloud or on-premise, virtually in minutes. Web data access has never been easier and more resilient.

Web Scraping represents a business concept with growing value in today’s networked world. However, Web Data Serving has taken over to deliver a far more productive and robust alternative to traditional Web Scraping technologies.

I will be continuing with more blogs on this topic, and as always, I’d love to hear your comments.

By:  Stefan Andreasen

Jun 22

I just read a very interesting article, “How New Intelligence Will Tame The Information Explosion”, on CNBC.com. The article, written by Steve LaValle and Jim Bramante of IBM, describes how one in three business leaders cannot make the right decision. The reason? According to LaValle and Bramante, even though there is an abundance of information around, the right data is rarely available at the right time for making good decisions.

“New Intelligence” is vital for today’s CEO and other decision makers to make agile and accurate decisions. It’s simply impossible to drive a company to success based on intuition and a closed group of advisers.  You need to assemble the right proof points, analyze them, and make your move.  And this can’t be a one-off effort.  You need to build it into your management process to continuously have your finger on the pulse, ready to alter your business direction at any given moment.

Two critical components necessary for “New Intelligence”

First (and most important): the data. No decision-making is better than the data behind it.

Second:  Analysis and reporting of the data. Fortunately, there is an ever-growing set of very good analysis and reporting tools on the market today, including products from large companies like IBM Cognos, Oracle Hyperion, SAP Business Objects, Microsoft Fast, and Autonomy, as well as numerous pure-play vendors like ClaraBridge, Corda, Attensity, and QlikTech.

One source of data that has become increasingly important today is web data. By web data I specifically mean data from public websites, including those of your competitors as well as from business partners.

Most web data is hidden behind a human browser interface and inaccessible to traditional enterprise applications. As a result, companies have been either cutting and pasting the data manually or writing fragile “Web scrapers” in languages like Perl or Python.

Unfortunately, those data acquisition methods are simply not scalable. Manual cut-and-paste can only access a fraction of the needed data while “Web scrapers” can only deal with static HTML pages and are insufficient to address the increasing amount of JavaScript and AJAX powered sites.

This is exactly what we have addressed at Kapow Technologies with the Kapow Web Data Server 7.0, the newest release of our flagship product. Using our proprietary, scalable HTML parser and JavaScript engine in a point-and-click development environment, it is now a breeze to extract or service-enable any web data, even behind the most complex and dynamic web applications. Typically, it only takes as much effort and time as it would to click through a website once. After that, all the critical web data is automatically collected or accessed to inform your business decisions and drive your business forward.

I encourage you to check it out and let us know what you think.

By:  Stefan Andreasen
